Abstract
Classical conditioning is a basic learning mechanism in animals and can be found in almost all organisms. If we want to construct robots with abilities matching those of their biological counterparts, this is one of the learning mechanisms that needs to be implemented first. This article describes a computational model of classical conditioning where the goal of learning is assumed to be the prediction of a temporally discounted reward or punishment based on the current stimulus situation.
The model is well suited for robotic implementation because it reproduces a number of classical conditioning paradigms and because learning in the model is guaranteed to converge with arbitrarily complex stimulus sequences. This is an essential feature once the step is taken beyond the simple laboratory experiment with two or three stimuli to the real world, where no such limitations exist. It is also demonstrated how the model can be embedded in a more complex system that includes various forms of sensory pre-processing, and how it can handle reinforcement learning and the timing of responses and function as an adaptive world model.
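To make the notion of a "temporally discounted reward" concrete, the following is a minimal sketch of a discounted sum of future reinforcement. It is not the article's model: the function name `discounted_return`, the discount factor `gamma`, and the example reward sequence are all assumptions introduced here for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Return the sum of rewards[k] * gamma**k over all delays k.

    `rewards` and `gamma` are illustrative assumptions, not symbols
    taken from the article.
    """
    total = 0.0
    # Work backwards so each step only needs one multiply and one add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical trial: a reward of 1.0 arrives three time steps after
# the stimulus; the later the reward, the smaller the prediction target.
target = discounted_return([0.0, 0.0, 0.0, 1.0], gamma=0.5)
print(target)
```

Under these assumptions, a stimulus that precedes the reward by a longer delay would be assigned a smaller target value, which is the sense in which the prediction is discounted in time.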
Cite this article
Balkenius, C., Morén, J. Dynamics of a Classical Conditioning Model. Autonomous Robots 7, 41–56 (1999). https://doi.org/10.1023/A:1008965713435