Abstract
This paper presents a hardware-friendly approach for adapting the structure of a reinforcement-learning-based neurocontroller. An unsupervised clustering algorithm partitions the state space of the system and adapts the size of its reinforcement module. On the well-known inverted pendulum problem, the system has proven to be much faster than previous neurocontroller approaches. We are currently working on an implementation of the system using field-programmable logic devices.
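As a rough illustration of the idea of partitioning a state space with a clustering scheme whose size adapts on-line, the following Python sketch grows a set of prototypes whenever an incoming state is far from all existing ones, and allocates one row of action values per prototype. It is not the algorithm of the paper, whose details are not given in this abstract. The class name `GrowingPartition`, the distance threshold, the learning rate, and the two-action assumption are all hypothetical choices made for the example.

```python
import math


class GrowingPartition:
    """Hypothetical on-line clustering: adds a prototype when none is close enough."""

    def __init__(self, threshold, learning_rate=0.1, n_actions=2):
        self.threshold = threshold          # assumed distance limit for reusing a cluster
        self.learning_rate = learning_rate  # assumed step size for moving a prototype
        self.n_actions = n_actions          # assumed number of discrete control actions
        self.prototypes = []                # one state-space prototype per cluster
        self.weights = []                   # one row of action values per cluster

    def assign(self, state):
        """Return the cluster index for `state`, creating a new cluster if needed."""
        if self.prototypes:
            dists = [math.dist(state, p) for p in self.prototypes]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] <= self.threshold:
                # Move the winning prototype a small step towards the state.
                old = self.prototypes[best]
                self.prototypes[best] = [
                    p + self.learning_rate * (s - p) for p, s in zip(old, state)
                ]
                return best
        # No prototype is close enough: grow the partition and the
        # reinforcement table together, so both keep the same size.
        self.prototypes.append(list(state))
        self.weights.append([0.0] * self.n_actions)
        return len(self.prototypes) - 1


partition = GrowingPartition(threshold=0.5)
states = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0), (0.05, 0.05)]
labels = [partition.assign(s) for s in states]
print(labels, len(partition.weights))
```

In this sketch, a reinforcement learner would read and update `partition.weights[label]` for the cluster returned by `assign`, so the number of adjustable parameters tracks the number of regions actually visited rather than being fixed in advance.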
A. Pérez-Uribe is supported by the Centre Suisse d'Electronique et de Microtechnique (CSEM), Neuchâtel, Switzerland.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pérez-Uribe, A., Sanchez, E. (1997). Structure-adaptable neurocontrollers: A hardware-friendly approach. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032585
DOI: https://doi.org/10.1007/BFb0032585
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0
eBook Packages: Springer Book Archive