Embedding decision-analytic control in a learning architecture

doi:10.1016/0004-3702(91)90008-8

Artificial Intelligence

Volume 49, Issues 1–3, May 1991, Pages 129-159

https://doi.org/10.1016/0004-3702(91)90008-8

Abstract

An autonomous agent's control problem is often formulated as the attempt to minimize the expected cost of accomplishing a goal. This paper presents a three-dimensional view of the control problem that is substantially more realistic. The agent's control policy is assessed along three dimensions: deliberation cost, execution cost, and goal value. The agent must choose which goal to attend to as well as which action to take. Our control policy seeks to maximize satisfaction by trading execution cost and goal value while keeping deliberation cost low. The agent's control decisions are guided by the MU heuristic—choose the alternative whose marginal expected utility is maximal. Thus, when necessary, the agent will prefer easily-achieved goals to attractive but difficult-to-attain alternatives. The MU heuristic is embedded in an architecture with record-keeping and learning capabilities. The architecture offers its control module expected utility and expected cost estimates that are gradually refined as the agent accumulates experience. A programmer is not required to supply that knowledge, and the estimates are provided without recourse to distributional assumptions.
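The MU heuristic and the experience-based estimates described above can be illustrated with a minimal sketch. The abstract does not define marginal expected utility precisely, so this sketch assumes it is the ratio of a goal's expected utility to its expected cost. It also assumes that both estimates are plain sample means over recorded episodes, which need no distributional assumptions. The names `GoalRecord` and `choose_goal` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoalRecord:
    """Hypothetical record of the utility and cost observed for one goal."""
    name: str
    utilities: List[float] = field(default_factory=list)
    costs: List[float] = field(default_factory=list)

    def record(self, utility: float, cost: float) -> None:
        # Store one episode of experience; cost is assumed to be positive.
        self.utilities.append(utility)
        self.costs.append(cost)

    def expected_utility(self) -> float:
        # Sample mean of the observed utilities.
        return sum(self.utilities) / len(self.utilities)

    def expected_cost(self) -> float:
        # Sample mean of the observed costs.
        return sum(self.costs) / len(self.costs)

    def marginal_expected_utility(self) -> float:
        # Assumption: expected utility gained per unit of expected cost.
        return self.expected_utility() / self.expected_cost()


def choose_goal(records: List[GoalRecord]) -> GoalRecord:
    """Pick the goal whose estimated marginal expected utility is largest."""
    tried = [r for r in records if r.costs]  # skip goals with no experience yet
    return max(tried, key=lambda r: r.marginal_expected_utility())


# Usage: an easily achieved goal versus an attractive but costly one.
easy = GoalRecord("easy")
easy.record(utility=4.0, cost=1.0)
easy.record(utility=6.0, cost=1.0)

hard = GoalRecord("hard")
hard.record(utility=20.0, cost=10.0)
hard.record(utility=0.0, cost=10.0)

chosen = choose_goal([easy, hard])
print(chosen.name)
```

With these made-up numbers the hard goal has the higher expected utility (10 versus 5), but the easy goal has the higher ratio (5 versus 1), so the sketch prefers it. This mirrors the abstract's remark that the agent will, when necessary, prefer easily achieved goals to attractive but difficult ones.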


