Deep Feedback Learning

Porr, Bernd; Miller, Paul

doi:10.1007/978-3-319-97628-0_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10994)

Included in the following conference series:

International Conference on Simulation of Adaptive Behavior

Abstract

An agent acting in an environment aims to minimise uncertainty, so that attacks can be predicted and rewards are not found only by chance. These events define an error signal which can be used to improve performance. In this paper we present a new algorithm in which an error signal from a reflex trains a novel deep network: the error is propagated forwards through the network, from its input to its output, in order to generate pro-active actions. We demonstrate the algorithm in two scenarios, a first-person shooter game and a car-driving scenario; in both cases the network develops strategies to become pro-active.
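The idea of passing an error forwards rather than backwards can be illustrated with a toy sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration and is not the authors' algorithm: the function names, the tanh units, the linear forward pass of the error, and the correlation-style update (weight change = learning rate × forwarded error × input activity) are all hypothetical.

```python
import math

# Illustrative sketch only: NOT the algorithm from the paper.
# A network is a list of layers; each layer is a list of weight rows,
# one row per unit, one weight per input to that unit.

def forward(weights, x):
    """Propagate the input through the layers; returns the input seen by each layer plus the output."""
    activations = [x]
    for layer in weights:
        x = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in layer]
        activations.append(x)
    return activations

def forward_error(weights, error):
    """Inject the scalar reflex error at the first layer and pass it forwards (assumed linear)."""
    e = [error] * len(weights[0])  # assumption: every first-layer unit receives the reflex error
    errors = [e]
    for layer in weights[1:]:
        e = [sum(w * v for w, v in zip(row, e)) for row in layer]
        errors.append(e)
    return errors

def update(weights, x, error, lr=0.1):
    """Return new weights after one assumed correlation step: dw = lr * forwarded error * input activity."""
    acts = forward(weights, x)
    errs = forward_error(weights, error)
    return [
        [
            [w + lr * errs[l][j] * acts[l][i] for i, w in enumerate(row)]
            for j, row in enumerate(layer)
        ]
        for l, layer in enumerate(weights)
    ]
```

In this sketch a zero reflex error leaves all weights unchanged, so learning stops once the reflex is no longer triggered; that stopping behaviour is a property of the assumed rule, not a result taken from the paper.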

Author information

Authors and Affiliations

Glasgow Neuro LTD, Glasgow, UK
Bernd Porr & Paul Miller

Corresponding author

Correspondence to Bernd Porr.

Editor information

Editors and Affiliations

University of Southern Denmark, Odense, Denmark
Poramate Manoonpong
University of Southern Denmark, Odense, Denmark
Jørgen Christian Larsen
University of Southern Denmark, Odense, Denmark
Xiaofeng Xiong
University of Southern Denmark, Odense, Denmark
John Hallam
Frankfurt Institute for Advanced Studies, Frankfurt/Main, Germany
Jochen Triesch

About this paper

Cite this paper

Porr, B., Miller, P. (2018). Deep Feedback Learning. In: Manoonpong, P., Larsen, J., Xiong, X., Hallam, J., Triesch, J. (eds) From Animals to Animats 15. SAB 2018. Lecture Notes in Computer Science, vol 10994. Springer, Cham. https://doi.org/10.1007/978-3-319-97628-0_16

DOI: https://doi.org/10.1007/978-3-319-97628-0_16
Published: 26 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97627-3
Online ISBN: 978-3-319-97628-0
eBook Packages: Computer Science; Computer Science (R0)