
Global Versus Local Constructive Function Approximation for On-Line Reinforcement Learning

  • Conference paper
AI 2005: Advances in Artificial Intelligence (AI 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3809)

Abstract

In order to scale to large state spaces, reinforcement learning (RL) algorithms need to apply function approximation techniques. Research on function approximation for RL has so far focused either on global methods with a static structure or on constructive architectures using locally responsive units. The former, whilst achieving some notable successes, has also failed on some relatively simple tasks. The locally constructive approach is more stable, but may scale poorly to higher-dimensional inputs. This paper examines two globally constructive algorithms based on the Cascade-Correlation (Cascor) supervised-learning algorithm. These algorithms are applied within the Sarsa RL algorithm, and their performance is compared against a multi-layer perceptron and a locally constructive algorithm, the Resource Allocating Network (RAN). It is shown that the globally constructive algorithms are less stable, but that on some tasks they achieve similar performance to the RAN whilst generating more compact solutions.
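Since the comparison hinges on how a differentiable Q-function approximator is trained on-line, a minimal sketch of Sarsa with a small fixed-structure MLP (the static baseline in the comparison) may help orient the reader. This is not the authors' implementation: the feature, action and hidden-layer sizes, the hyperparameters and the NumPy-only network below are assumptions for illustration. The constructive variants discussed in the paper would additionally grow hidden units during learning (cascade-style for Cascor, locally responsive units for the RAN) rather than keeping the structure fixed.

```python
# Illustrative sketch only: Sarsa with a one-hidden-layer MLP Q-approximator.
# Sizes and hyperparameters below are assumed, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES, N_ACTIONS, N_HIDDEN = 4, 2, 8   # assumed problem/network sizes
ALPHA, GAMMA, EPSILON = 0.01, 0.99, 0.1     # assumed hyperparameters

# MLP mapping state features to one Q-value per action.
W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_FEATURES))
W2 = rng.normal(0.0, 0.1, (N_ACTIONS, N_HIDDEN))

def q_values(s):
    """Return Q(s, .) and the cached hidden activations."""
    h = np.tanh(W1 @ s)
    return W2 @ h, h

def epsilon_greedy(q):
    """Epsilon-greedy action selection over the Q estimates."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q))

def sarsa_step(s, a, r, s_next, a_next, done):
    """One on-line Sarsa update: w += alpha * delta * grad_w Q(s, a),
    where delta = r + gamma * Q(s', a') - Q(s, a)."""
    global W1, W2
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    target = r if done else r + GAMMA * q_next[a_next]
    delta = target - q[a]
    # Gradient of the chosen action's output with respect to each weight layer.
    grad_W2 = np.zeros_like(W2)
    grad_W2[a] = h
    grad_hidden = W2[a] * (1.0 - h ** 2)     # derivative through tanh
    grad_W1 = np.outer(grad_hidden, s)
    W2 += ALPHA * delta * grad_W2
    W1 += ALPHA * delta * grad_W1
    return delta
```

In an environment loop, `sarsa_step` would be called once per transition, with both the current and the next action chosen by `epsilon_greedy`; it is this per-step, on-policy update that makes the learning fully on-line.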




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vamplew, P., Ollington, R. (2005). Global Versus Local Constructive Function Approximation for On-Line Reinforcement Learning. In: Zhang, S., Jarvis, R. (eds) AI 2005: Advances in Artificial Intelligence. AI 2005. Lecture Notes in Computer Science, vol 3809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11589990_14

  • DOI: https://doi.org/10.1007/11589990_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30462-3

  • Online ISBN: 978-3-540-31652-7

  • eBook Packages: Computer Science, Computer Science (R0)
