Abstract
Extreme learning machine (ELM) is a machine learning technique with a simple structure, fast learning speed, and excellent generalization ability, and it has attracted considerable attention since it was proposed. To further improve the sparsity of the output weights and the robustness of the model, this paper proposes a sparse and robust ELM based on zero-norm regularization and a non-convex quadratic loss function. The zero-norm regularization prunes hidden nodes automatically, and the non-convex quadratic loss function enhances robustness by assigning a constant penalty to outliers. The resulting optimization problem can be formulated as a difference of convex functions (DC) program, which this paper solves with the DC algorithm (DCA). Experiments on artificial and benchmark datasets verify that the proposed method achieves promising robustness while reducing the number of hidden nodes, especially on datasets with higher outlier levels.
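The method summarized above builds on the standard regularized ELM: a random, untrained hidden layer followed by a closed-form least-squares solve for the output weights. The following is a minimal NumPy sketch of that baseline only, not the proposed zero-norm/DCA algorithm; the node count, activation, and regularization parameter are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) plus mild Gaussian noise
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(200)

# Random, untrained hidden layer (the defining trait of ELM)
L = 50                                      # hidden nodes (illustrative)
C = 1e3                                     # regularization parameter
W = rng.standard_normal((X.shape[1], L))    # random input weights
b = rng.standard_normal(L)                  # random biases
H = np.tanh(X @ W + b)                      # hidden-layer output matrix

# Closed-form regularized least squares for the output weights:
# beta = (H^T H + I / C)^{-1} H^T y
beta = np.linalg.solve(H.T @ H + np.eye(L) / C, H.T @ y)

pred = H @ beta
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

The proposed method replaces the squared loss and the implicit dense solution in this baseline with a non-convex quadratic loss and zero-norm regularization, and solves the resulting DC program iteratively by DCA rather than in closed form.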
Data availability
All data included in this study is available from the first author and can also be found on the website provided in the manuscript.
Notes
The Matlab code for our algorithm is available at: https://github.com/kaboqeboo/ASRELM.
Acknowledgements
The work was supported by the National Natural Science Foundation of China under Grant Nos. 61833005 and 61907033, the Postdoctoral Science Foundation of China under Grant No. 2018M642129, and the Postgraduate Innovation and Practice Ability Development Fund of Xi’an Shiyou University under Grant No. YCS22213168.
Author information
Contributions
All authors contributed to the study conception and design. Xiaoxue Wang: Conceptualisation of this study, Methodology, Software, Writing the manuscript. Kuaini Wang: Conceptualisation of this study, Research design, Editing of manuscript. Yanhong She and Jinde Cao: Data curation, Supervision, Editing of manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, X., Wang, K., She, Y. et al. Zero-Norm ELM with Non-convex Quadratic Loss Function for Sparse and Robust Regression. Neural Process Lett 55, 12367–12399 (2023). https://doi.org/10.1007/s11063-023-11424-9