Abstract
Many algorithms have been proposed for pruning and sparse approximation of feedforward neural networks with random weights, with the aim of obtaining compact networks that are fast and robust on various datasets. One drawback of the randomization process is that the resulting weight vectors may be highly correlated. It has been shown that the error of an ensemble of classifiers depends on the amount of error correlation between its members. Thus, decreasing the correlation between the output vectors should lead to more efficient hidden nodes. In this work a new learning algorithm for single-hidden-layer feedforward networks, called the New Sparse Learning Machine (NSLM), is proposed for regression and classification. In the first phase, the algorithm creates a hidden layer with low correlation among its nodes by orthogonalizing the columns of the hidden-layer output matrix. In the second phase, NSLM solves an \(L_1\)-norm minimization problem to drive as many components of the solution vector to zero as possible. The resulting network has a higher degree of sparsity while accuracy is maintained or improved, and therefore better generalization performance. Numerical comparisons on several classification and regression datasets confirm the expected improvement over the basic network.
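To make the two phases concrete, the following is a minimal Python sketch of the general idea rather than the authors' exact formulation: a hidden-layer output matrix is built from random weights, its columns are orthogonalized with a QR decomposition, and scikit-learn's Lasso stands in for the \(L_1\)-norm minimization that sparsifies the output-weight vector. The function name nslm_sketch and the parameters n_hidden and alpha are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Lasso

def nslm_sketch(X, y, n_hidden=50, alpha=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Random hidden layer, as in feedforward networks with random weights.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer output matrix

    # Phase 1: orthogonalize the columns of H so the hidden-node output
    # vectors are uncorrelated (QR decomposition is one possible choice).
    Q, R = np.linalg.qr(H)

    # Phase 2: an L1-penalized fit (Lasso) drives many components of the
    # output-weight vector to zero, producing a sparse solution.
    beta = Lasso(alpha=alpha, fit_intercept=False).fit(Q, y).coef_
    return W, b, R, beta

Since \(Q = H R^{-1}\), new samples can be mapped through the same random hidden layer and multiplied by \(R^{-1}\) before applying the sparse weights beta.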
Acknowledgements
The authors would like to thank the anonymous referee for the constructive comments and suggestions that helped improve the quality of this manuscript.
Cite this article
Nayyeri, M., Maskooki, A. & Monsefi, R. A New Sparse Learning Machine. Neural Process Lett 46, 15–28 (2017). https://doi.org/10.1007/s11063-016-9566-2