Abstract
We present a motivational system for an agent undergoing reinforcement learning (RL) that enables it to balance multiple drives, each satiated by a different type of stimulus. Inspired by drive reduction theory, the system uses Minor Component Analysis (MCA) to model the agent’s internal drive state, and modulates incoming stimuli according to how strongly each stimulus satiates the currently active drive. The agent’s policy is continually updated through least-squares temporal difference learning, so the agent automatically seeks stimuli that satiate its most active internal drive first, then the next most active, and so on. We prove that our algorithm is stable under certain conditions, and experimental results illustrate its behavior.
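The abstract names two core updates: an anti-Hebbian MCA rule for tracking the internal drive state, and a least-squares temporal difference (LSTD) update for the value function. The sketch below shows how these standard building blocks might look, assuming an Oja-style anti-Hebbian minor-component rule and textbook LSTD(0); all names, dimensions, and step sizes are illustrative assumptions, not the authors’ implementation.

```python
import numpy as np

def mca_update(w, x, eta=0.01):
    """One anti-Hebbian MCA step (sign-flipped Oja rule, illustrative).

    Moves w toward the minor eigenvector of the input covariance,
    i.e. the direction of least variance in the stimulus stream x.
    """
    y = w @ x                               # projection onto current estimate
    w = w - eta * (y * x - (y ** 2) * w)    # anti-Hebbian: negated Oja PCA update
    return w / np.linalg.norm(w)            # renormalize for numerical stability

class LSTD:
    """Incremental LSTD(0): accumulate A and b, solve for weights on demand."""

    def __init__(self, n_features, gamma=0.95, reg=1e-3):
        self.gamma = gamma
        self.A = reg * np.eye(n_features)   # small ridge term keeps A invertible
        self.b = np.zeros(n_features)

    def update(self, phi, reward, phi_next):
        # Standard LSTD sufficient statistics for one observed transition.
        self.A += np.outer(phi, phi - self.gamma * phi_next)
        self.b += reward * phi

    def weights(self):
        # Value-function weights theta solving A theta = b.
        return np.linalg.solve(self.A, self.b)
```

How the reward is shaped from the drive state, and the conditions under which stability is proven, are specified in the full paper; the sketch only illustrates the two named components in isolation.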
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kompella, V.R., Kazerounian, S., Schmidhuber, J. (2014). An Anti-hebbian Learning Rule to Represent Drive Motivations for Reinforcement Learning. In: del Pobil, A.P., Chinellato, E., Martinez-Martin, E., Hallam, J., Cervera, E., Morales, A. (eds) From Animals to Animats 13. SAB 2014. Lecture Notes in Computer Science, vol 8575. Springer, Cham. https://doi.org/10.1007/978-3-319-08864-8_17
DOI: https://doi.org/10.1007/978-3-319-08864-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08863-1
Online ISBN: 978-3-319-08864-8