Implementing Temporal-Difference Learning with the Scaled Conjugate Gradient Algorithm

Neural Processing Letters

Abstract

This paper investigates the use of the scaled conjugate gradient (SCG) algorithm in temporal-difference (TD) learning for time series prediction. After examining the theoretical background of the algorithm and of the learning methodology, and how the two can be combined, it places special emphasis on implementation details. Simple time series (linear, sinusoidal, etc.) as well as more complex ones derived from real data are used to examine the behavior of this novel combination of learning algorithm and methodology. Preliminary experimental results indicate that the implementation presented in this paper works, but that the performance of TD learning with the SCG algorithm, in terms of learning speed and generalization ability, is not as good as expected, at least on the representative problems examined. An attempt to explain these results is presented.
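
For readers unfamiliar with the learning methodology, the following is a minimal sketch of a textbook TD(λ) weight update for a linear predictor, not the paper's implementation. It uses a plain gradient step where the paper uses the SCG algorithm, and the function names, the linear model, and the parameters `alpha` and `lam` are assumptions made for illustration only.

```python
def predict(weights, x):
    """Linear prediction P = w . x (the linear model is an illustrative assumption)."""
    return sum(w * xi for w, xi in zip(weights, x))


def td_lambda_update(weights, observations, outcome, alpha=0.1, lam=0.5):
    """Return updated weights after one observation sequence.

    Each step contributes alpha * (P[t+1] - P[t]) * trace, where the trace is
    the lambda-discounted sum of past gradients. For a linear predictor the
    gradient of P[t] with respect to the weights is simply x[t].
    Changes are accumulated and applied once, at the end of the sequence.
    """
    n = len(weights)
    trace = [0.0] * n      # eligibility trace
    delta_w = [0.0] * n    # accumulated weight change

    # Predictions are computed with the weights held fixed for the sequence.
    preds = [predict(weights, x) for x in observations]
    # The target for each prediction is the next prediction; the last
    # prediction is compared with the actual outcome.
    targets = preds[1:] + [outcome]

    for x, p, target in zip(observations, preds, targets):
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        td_error = target - p
        delta_w = [d + alpha * td_error * e for d, e in zip(delta_w, trace)]

    return [w + d for w, d in zip(weights, delta_w)]
```

With `lam=1.0` the accumulated change equals the ordinary supervised (delta-rule) update toward the final outcome, while `lam=0.0` adjusts each prediction only toward its immediate successor.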



Author information


Correspondence to Tasos Falas.

About this article

Cite this article

Falas, T., Stafylopatis, A. Implementing Temporal-Difference Learning with the Scaled Conjugate Gradient Algorithm. Neural Process Lett 22, 361–375 (2005). https://doi.org/10.1007/s11063-005-1384-x

  • DOI: https://doi.org/10.1007/s11063-005-1384-x
