Abstract
High order neural networks (HONNs) are neural networks whose neurons combine their inputs non-linearly. The high order network with exponential synaptic links (HONEST) network is a HONN whose neurons use product units with adaptable exponents. This study examines four advanced learning methods for training the HONEST network: resilient propagation, conjugate gradient, scaled conjugate gradient (SCG), and the Levenberg–Marquardt method. On a collection of 32 widely used benchmark datasets, we compare the mean squared error (MSE) of the HONEST network trained with each of the four algorithms, as well as with standard backpropagation, and find that SCG yields the best performance by a statistically significant margin. We also investigate adding a regularization term to the error function that penalizes large exponent magnitudes, nudging the network toward smaller exponents. We find that this regularization reduces exponent magnitudes without compromising test-set MSE.
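To make the abstract's two key ideas concrete, the following is a minimal NumPy sketch of a HONEST-style forward pass and a regularized error function. It is an illustration under assumptions, not the authors' exact formulation: the hidden units are taken to be product units that raise each (positive) input to an adaptable exponent, and the exponent penalty is assumed here to be a simple L2 term added to the MSE.

```python
import numpy as np

rng = np.random.default_rng(0)

def honest_forward(x, exponents, out_weights):
    """HONEST-style forward pass (illustrative assumption).

    x           : positive inputs, shape (n_in,)
    exponents   : adaptable exponents, shape (n_hidden, n_in)
    out_weights : linear output weights, shape (n_hidden,)
    """
    # Each hidden product unit j computes prod_i x_i ** e_{ji}:
    # the "synaptic link" is exponential rather than a simple weight.
    hidden = np.prod(x ** exponents, axis=1)
    return out_weights @ hidden

def regularized_mse(preds, targets, exponents, lam=0.01):
    # MSE plus a penalty that nudges exponent magnitudes toward zero;
    # the exact penalty form in the paper is not given here, so an
    # L2 term is assumed for illustration.
    return np.mean((preds - targets) ** 2) + lam * np.sum(exponents ** 2)

# Hypothetical toy usage with random exponents and output weights.
x = np.array([1.5, 2.0, 0.5])           # inputs assumed positive
E = rng.normal(scale=0.5, size=(4, 3))  # adaptable exponents
v = rng.normal(size=4)                  # output-layer weights

y_hat = honest_forward(x, E, v)
loss = regularized_mse(np.array([y_hat]), np.array([1.0]), E)
```

With all exponents fixed at 1, each hidden unit reduces to the plain product of the inputs, which shows how the adaptable exponents generalize ordinary product units.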
Acknowledgments
The partial support of a Brandon University Research Grant is gratefully acknowledged.
Cite this article
El-Nabarawy, I., Abdelbar, A.M. Advanced learning methods and exponent regularization applied to a high order neural network. Neural Comput & Applic 25, 897–910 (2014). https://doi.org/10.1007/s00521-014-1563-7