Abstract
This paper investigates the use of the scaled conjugate gradient (SCG) algorithm in temporal-difference (TD) learning for time series prediction. After examining the theoretical background of the algorithm and the learning methodology, and how the two can be combined, special emphasis is placed on implementation details. Simple time series (linear, sinusoidal, etc.) as well as more complex ones drawn from real data are used to examine the behavior of this novel combination of learning algorithm and methodology. Preliminary experimental results indicate that the implementation presented in this paper works, but that the performance of TD learning using the SCG algorithm, in terms of learning speed and generalization ability, is not as good as expected, at least on the representative problems examined. An attempt to explain these results is presented.
Cite this article
Falas, T., Stafylopatis, A. Implementing Temporal-Difference Learning with the Scaled Conjugate Gradient Algorithm. Neural Process Lett 22, 361–375 (2005). https://doi.org/10.1007/s11063-005-1384-x