Abstract
High order neural networks (HONNs) are neural networks whose neurons combine their inputs non-linearly. The high order network with exponential synaptic links (HONEST) network is a HONN whose neurons use product units with adaptable exponents. This study examines four advanced learning methods for training the HONEST network: resilient propagation, conjugate gradient, scaled conjugate gradient (SCG), and the Levenberg–Marquardt method. On a collection of 32 widely used benchmark datasets, we compare the mean squared error (MSE) of the HONEST network trained with each of the four algorithms, as well as with standard backpropagation, and find that SCG yields the best performance by a statistically significant margin. We also investigate adding a regularization term to the error function that penalizes large exponent magnitudes, nudging the network toward smaller exponents. We find that this regularization reduces exponent magnitudes without compromising test-set MSE.
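To make the abstract's two key ideas concrete, the following is a minimal NumPy sketch of a HONEST-style forward pass and a regularized error function. It is an illustration under assumptions, not the authors' exact formulation: the hidden units are taken to be product units that raise each (positive) input to an adaptable exponent, and the exponent penalty is assumed here to be a simple L2 term added to the MSE.

```python
import numpy as np

rng = np.random.default_rng(0)

def honest_forward(x, exponents, out_weights):
    """HONEST-style forward pass (illustrative assumption).

    x           : positive inputs, shape (n_in,)
    exponents   : adaptable exponents, shape (n_hidden, n_in)
    out_weights : linear output weights, shape (n_hidden,)
    """
    # Each hidden product unit j computes prod_i x_i ** e_{ji}:
    # the "synaptic link" is exponential rather than a simple weight.
    hidden = np.prod(x ** exponents, axis=1)
    return out_weights @ hidden

def regularized_mse(preds, targets, exponents, lam=0.01):
    # MSE plus a penalty that nudges exponent magnitudes toward zero;
    # the exact penalty form in the paper is not given here, so an
    # L2 term is assumed for illustration.
    return np.mean((preds - targets) ** 2) + lam * np.sum(exponents ** 2)

# Hypothetical toy usage with random exponents and output weights.
x = np.array([1.5, 2.0, 0.5])           # inputs assumed positive
E = rng.normal(scale=0.5, size=(4, 3))  # adaptable exponents
v = rng.normal(size=4)                  # output-layer weights

y_hat = honest_forward(x, E, v)
loss = regularized_mse(np.array([y_hat]), np.array([1.0]), E)
```

With all exponents fixed at 1, each hidden unit reduces to the plain product of the inputs, which shows how the adaptable exponents generalize ordinary product units.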
Acknowledgments
The partial support of a Brandon University Research Grant is gratefully acknowledged.
Cite this article
El-Nabarawy, I., Abdelbar, A.M. Advanced learning methods and exponent regularization applied to a high order neural network. Neural Comput & Applic 25, 897–910 (2014). https://doi.org/10.1007/s00521-014-1563-7