Evolution of Reinforcement Learning in Uncertain Environments: Emergence of Risk-Aversion and Matching

Niv, Yael; Joel, Daphna; Meilijson, Isaac; Ruppin, Eytan

doi:10.1007/3-540-44811-X_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2159)

Included in the following conference series:

European Conference on Artificial Life

Abstract

Reinforcement learning (RL) is a fundamental process by which organisms learn to achieve a goal from interactions with the environment. Using Artificial Life techniques we derive (near-)optimal neuronal learning rules in a simple neural network model of decision-making in simulated bumblebees foraging for nectar. The resulting networks exhibit efficient RL, allowing the bees to respond rapidly to changes in reward contingencies. The evolved synaptic plasticity dynamics give rise to varying exploration/exploitation levels from which emerge the well-documented foraging strategies of risk aversion and probability matching. These are shown to be a direct result of optimal RL, providing a biologically founded, parsimonious and novel explanation for these behaviors. Our results are corroborated by a rigorous mathematical analysis and by experiments in mobile robots.
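The link between reinforcement learning and risk aversion can be illustrated with a minimal sketch. This is not the authors' evolved network: it assumes a plain delta-rule value update with epsilon-greedy choice, and the function name, reward values, and parameters (`learning_rate`, `epsilon`, `n_trials`, `seed`) are all illustrative assumptions. A simulated forager chooses between a constant flower and a variable flower with the same mean reward. Because a disappointing visit to the variable flower lowers its estimated value and the forager then stops sampling it, the variable flower ends up visited on fewer than half of the trials.

```python
import random


def simulate_foraging(learning_rate=0.9, epsilon=0.1, n_trials=5000, seed=0):
    """Return the fraction of trials on which the variable flower was chosen.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Index 0: constant flower (reward 0.5 on every visit).
    # Index 1: variable flower (reward 1.0 on half the visits, else 0.0).
    # Both flower types have the same mean reward of 0.5.
    values = [0.0, 0.0]
    variable_visits = 0
    for _ in range(n_trials):
        # Epsilon-greedy choice; ties are broken at random.
        if rng.random() < epsilon or values[0] == values[1]:
            choice = rng.randrange(2)
        else:
            choice = 0 if values[0] > values[1] else 1
        if choice == 0:
            reward = 0.5
        else:
            reward = 1.0 if rng.random() < 0.5 else 0.0
        # Delta-rule update of the chosen flower's estimated value only.
        values[choice] += learning_rate * (reward - values[choice])
        variable_visits += choice
    return variable_visits / n_trials


if __name__ == "__main__":
    print(simulate_foraging())
```

A returned fraction below 0.5 indicates a preference for the constant flower even though neither flower yields more nectar on average.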

Author information

Authors and Affiliations

Department of Psychology, Tel-Aviv University, Tel-Aviv, 69978, Israel
Yael Niv & Daphna Joel
School of Mathematical Sciences, Tel-Aviv University, Tel-Aviv, 69978, Israel
Isaac Meilijson & Eytan Ruppin

Editor information

Editors and Affiliations

Faculty of Philosophy and Science, Institute of Computer Science, Silesian University, 74601, Opava, Czech Republic
Jozef Kelemen & Petr Sosík

About this paper

Cite this paper

Niv, Y., Joel, D., Meilijson, I., Ruppin, E. (2001). Evolution of Reinforcement Learning in Uncertain Environments: Emergence of Risk-Aversion and Matching. In: Kelemen, J., Sosík, P. (eds) Advances in Artificial Life. ECAL 2001. Lecture Notes in Computer Science, vol 2159. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44811-X_27

DOI: https://doi.org/10.1007/3-540-44811-X_27
Published: 30 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42567-0
Online ISBN: 978-3-540-44811-2
eBook Packages: Springer Book Archive
