Foundations of learning in autonomous agents

https://doi.org/10.1016/0921-8890(91)90018-G

Abstract

Autonomous agents must learn to act in complex, noisy domains. This paper provides a formal description of the problem of building autonomous agents that learn to act and describes a common framework for the specification of learning algorithms for autonomous agents. In addition, it provides correctness, convergence, and complexity metrics for learning algorithms for autonomous agents. These metrics allow algorithms based on a wide variety of programming paradigms to be objectively compared.
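
The abstract alludes to a common framework in which an agent repeatedly perceives its world, acts, and receives a reinforcement signal. The following is a rough, hedged sketch of that kind of interaction loop, not the paper's formal framework; Agent, World, and run are illustrative names introduced here.

```python
# Minimal sketch of an agent-environment interaction loop for learning to act,
# assuming a discrete-time setting with a scalar reinforcement signal.
# Agent, World, and run are illustrative names, not definitions from the paper.

class Agent:
    def act(self, observation):
        """Choose an action from the current input (the agent's behavior)."""
        raise NotImplementedError

    def learn(self, observation, action, reinforcement, next_observation):
        """Adjust the behavior using the reinforcement received."""
        raise NotImplementedError


class World:
    def observe(self):
        """Return the agent's current (possibly noisy) input."""
        raise NotImplementedError

    def execute(self, action):
        """Apply the action and return the resulting reinforcement signal."""
        raise NotImplementedError


def run(agent, world, steps):
    """Drive the perceive-act-reinforce loop for a fixed number of steps."""
    observation = world.observe()
    for _ in range(steps):
        action = agent.act(observation)
        reinforcement = world.execute(action)
        next_observation = world.observe()
        agent.learn(observation, action, reinforcement, next_observation)
        observation = next_observation
```

Correctness, convergence, and complexity criteria of the kind the abstract mentions could then be stated over the behavior such a loop settles on and the reinforcement it accumulates over time.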

Cited by (12)

  • Self-Organization and Autonomous Robots (2012), Neural Systems for Robotics

  • ARBIB: An autonomous robot based on inspirations from biology (2000), Robotics and Autonomous Systems

    Citation excerpt: "Learning then acts to maximize the reward or return associated with or predicted from the reinforcement signal over time. The many variants of this form of learning (especially the Q-learning paradigm of Watkins [79] and Watkins and Dayan [80]) have been very popular in animat studies (e.g. [37,39,44,48,52,70]). Reinforcement learning is similar to the S–R theory of conditioning [35, p. 350] which posits a direct link between the conditioned stimulus and response, in contrast to Pavlovian S–S theory which posits association of two stimuli — the CS and the US."

    (A sketch of the Q-learning update mentioned in this excerpt follows the list.)

  • Towards Incremental Autonomy Framework for On-Orbit Vision-Based Grasping (2021), Proceedings of the International Astronautical Congress, IAC
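
The ARBIB excerpt above refers to the Q-learning paradigm of Watkins. As a hedged illustration only, here is a minimal tabular Q-learning sketch; the environment interface (reset/step), the action set, and the parameter values are assumptions made for the example and are not taken from the cited papers.

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with the Watkins-style one-step update.

    Assumes a hypothetical discrete environment exposing env.reset() -> state
    and env.step(action) -> (next_state, reward, done).
    """
    Q = defaultdict(float)  # Q[(state, action)] -> estimated return
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection over the current estimates.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```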

This work was supported in part by a gift from the System Development Foundation and in part by the Air Force Office of Scientific Research under contract #F49620-89-C-0055.
