Neuro-Resistive Grid approach to trainable controllers: A pole balancing example

Bapi, Raju S.; D'Cruz, Brendan; Bugmann, Guido

doi:10.1007/BF01414101

Published: March 1997

Neural Computing & Applications, Volume 5, pages 33–44 (1997)

Abstract

A new neural network approach is described for the task of pole balancing, which is considered a benchmark learning control problem. The approach combines the Associative Search Element (ASE) of Barto, Sutton and Anderson [1] with a Neuro-Resistive Grid (NRG) [2] acting as the Adaptive Critic Element (ACE). The novel feature of the NRG is that it evaluates a state by propagating failure information to neighbouring nodes in the grid. The NRG is updated only on a failure, and it provides the ASE with a continuous internal reinforcement signal by comparing the value of the present state with that of the previous state. The resulting system learns more rapidly and with fewer computations than that of Barto et al. [1]. To establish a uniform basis for comparing pole-balancing algorithms, both systems are simulated using the benchmark parameters and tests specified by Geva and Sitte [3].
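
The abstract does not give the NRG update equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a small 2D grid of state values initialised to 1.0, clamps failure cells to 0.0, spreads that information by averaging each cell with its four neighbours, and takes the internal reinforcement to be the difference between the values of the present and previous states. The function names `relax_grid` and `internal_reinforcement`, the grid size, the clamp values and the number of sweeps are all hypothetical choices made for illustration.

```python
def relax_grid(values, failed, sweeps=1):
    """Spread failure information to neighbouring cells (assumed averaging rule).

    values: 2D list of floats, one value per discretised state (assumption).
    failed: set of (row, col) cells where a failure was observed.
    """
    rows, cols = len(values), len(values[0])
    v = [row[:] for row in values]
    for _ in range(sweeps):
        # Clamp failure cells to the lowest value before each sweep.
        for (i, j) in failed:
            v[i][j] = 0.0
        new = [row[:] for row in v]
        for i in range(rows):
            for j in range(cols):
                # At the border, a missing neighbour is replaced by the cell itself.
                up = v[max(i - 1, 0)][j]
                down = v[min(i + 1, rows - 1)][j]
                left = v[i][max(j - 1, 0)]
                right = v[i][min(j + 1, cols - 1)]
                new[i][j] = (up + down + left + right) / 4.0
        v = new
    # Failure cells stay clamped after the last sweep.
    for (i, j) in failed:
        v[i][j] = 0.0
    return v


def internal_reinforcement(v, prev_state, state):
    """Value of the present state minus value of the previous state."""
    (pi, pj), (i, j) = prev_state, state
    return v[i][j] - v[pi][pj]


# Hypothetical usage: one failure at cell (0, 2) on a 5x5 grid.
grid = [[1.0] * 5 for _ in range(5)]
v = relax_grid(grid, failed={(0, 2)})
print(internal_reinforcement(v, prev_state=(2, 2), state=(1, 2)))
```

In this sketch, a move towards the failed cell yields a negative signal and a move away from it yields a positive one, which is the kind of continuous evaluation the abstract attributes to the NRG acting as critic.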

References

1. Barto AG, Sutton RS, Anderson CW. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Syst, Man & Cybern 1983; 13: 834–846
2. Bugmann G, Taylor JG, Denham MJ. Route finding by neural nets. In: Taylor JG (ed), Neural Networks, Unicom & Alfred Waller, UK, 1995, 217–231
3. Geva S, Sitte J. A Cartpole experiment benchmark for trainable controllers. IEEE Control Systems Magazine 1993; 13: 40–51
4. Rosen BE, Goodwin JM, Vidal JJ. Process control with adaptive range coding. Biol Cybern 1992; 66: 419–428
5. Sutton RS. Learning to predict by the method of temporal differences. Machine Learning 1988; 3: 9–44
6. Barto AG, Sutton RS, Watkins CJCH. Learning and sequential decision making. In: Gabriel M, Moore J. (ed.), Learning and Computational Neuroscience: Foundation of Adaptive Networks, MIT Press, Cambridge, MA, 1990, 539–602
7. Barto AG, Bradtke SJ, Singh SP. Learning to act using real-time dynamic programming. Artificial Intelligence 1995; 72: 81–138
8. Ribeiro CHC. Attentional mechanism as a strategy for generalisation in the Q-learning algorithm. In: Fogelman-Soulié F, Gallinari P. (ed.), Proc. ICANN '95, Paris, 1995; 1: 455–460
9. Connolly CI, Burns JB, Weiss R. Path planning using Laplace's equation. Proc IEEE Int Conf Robotics & Automation 1990; 2102–2106
10. Tarassenko L, Blake A. Analogue computation of collision-free paths. Proc IEEE Int Conf on Robotics & Automation, Sacramento, CA, 1991, 540–545
11. Sutton RS, Pinette B. The learning of world models by connectionist networks. Proc Seventh Ann Conf of the Cog Sci Soc, Lawrence Erlbaum, 1985, 54–64
12. Moore AW. Efficient memory-based learning for robot control. PhD thesis, University of Cambridge, 1990
13. Tesauro G. Temporal difference learning and TD-Gammon. Comm ACM 1995; 38(3): 58–68
14. Prokhorov DV, Santiago RA, Wunsch II DC. Adaptive critic designs: A case study for neurocontrol. Neural Networks 1995; 8(9): 1367–1372

Author information

Authors and Affiliations

Neurodynamics Research Group, School of Computing, University of Plymouth, PL4 8AA, Plymouth, UK
Raju S. Bapi, Brendan D'Cruz & Guido Bugmann

Corresponding author

Correspondence to Raju S. Bapi.

About this article

Cite this article

Bapi, R.S., D'Cruz, B. & Bugmann, G. Neuro-Resistive Grid approach to trainable controllers: A pole balancing example. Neural Comput & Applic 5, 33–44 (1997). https://doi.org/10.1007/BF01414101

Issue Date: March 1997
DOI: https://doi.org/10.1007/BF01414101
