Abstract
In this paper, we focus on the use of the three-layer backpropagation network in vector-valued time series estimation problems. The neural network provides a computationally simple framework for solving the estimation problem, yet the search for optimal, or even feasible, neural networks for stochastic processes is both time-consuming and uncertain. The backpropagation algorithm, written in strict ANSI C, has been implemented as a standalone support library for the genetic hybrid algorithm (GHA), running on any sequential or parallel mainframe computer. To cope with ill-conditioned time series problems, we extend the original backpropagation algorithm to a K-nearest-neighbors algorithm (K-NARX), where the number K is determined genetically along with a set of key parameters. In the K-NARX algorithm, the terminal solution at instant t can be used as the starting point for the next instant, which tends to stabilize the optimization process when dealing with autocorrelated time series vectors. This possibility has proved especially useful in difficult time series problems. Following prevailing research directions, we use a genetic algorithm to determine an optimal parameterization for the network, including the lag structure of the nonlinear vector time series system; the net structure, with one or two hidden layers and the corresponding numbers of nodes; the type of activation function (currently the standard logistic sigmoid, a bipolar transformation, the hyperbolic tangent, an exponential function and the sine function); the type of minimization algorithm; the number K of nearest neighbors in the K-NARX procedure; the initial value of the Levenberg–Marquardt damping parameter; and the value of the neural learning (stabilization) coefficient α. We have aimed at a flexible structure that allows, e.g., new minimization algorithms and activation functions to be added in the future.
We demonstrate the power of the genetically trimmed K-NARX algorithm on a representative data set.
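The warm-start idea behind the K-NARX stabilization, reusing the terminal weight vector at instant t as the starting point at the next instant, can be sketched as below. The `train_window` routine and its signature are hypothetical stand-ins for one minimization pass over the K-neighborhood, not the actual GHA/K-NARX interface; here the stub applies a fixed damped update so the control flow is runnable:

```c
#define N_WEIGHTS 16

/* Hypothetical one-window training step: refines the weights in place
   for the K nearest neighbors of the pattern at time t.  In the real
   library this would run backpropagation / Levenberg-Marquardt on the
   selected neighborhood; here a damped update stands in for it. */
static void train_window(double *w, int t, int k)
{
    int i;
    (void)t; (void)k;          /* unused by the placeholder update */
    for (i = 0; i < N_WEIGHTS; ++i)
        w[i] *= 0.9;           /* placeholder for one minimization pass */
}

/* Rolling estimation over an autocorrelated series: on entry to each
   step, w already holds the terminal solution from the previous
   instant, so every window starts warm.  This is the property the
   abstract credits with stabilizing the optimization. */
void knarx_roll(double *w, int t_begin, int t_end, int k)
{
    int t;
    for (t = t_begin; t < t_end; ++t)
        train_window(w, t, k); /* warm start: w carries over from t-1 */
}
```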
Östermark, R. Geno-mathematical identification of the multi-layer perceptron. Neural Comput & Applic 18, 331–344 (2009). https://doi.org/10.1007/s00521-008-0184-4