
Zero-Norm ELM with Non-convex Quadratic Loss Function for Sparse and Robust Regression

Neural Processing Letters

Abstract

Extreme learning machine (ELM) is a machine learning technique with a simple structure, fast learning speed, and excellent generalization ability, and it has received considerable attention since it was proposed. To further improve the sparsity of the output weights and the robustness of the model, this paper proposes a sparse and robust ELM based on zero-norm regularization and a non-convex quadratic loss function. The zero-norm regularization prunes hidden nodes automatically, and the non-convex quadratic loss function enhances robustness by assigning a constant penalty to outliers. The resulting optimization problem is formulated as a difference of convex functions (DC) program, which is solved with the DC algorithm (DCA). Experiments on artificial and benchmark datasets verify that the proposed method achieves promising robustness while reducing the number of hidden nodes, especially on datasets with higher outlier levels.
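The abstract does not spell out the model, so the Python sketch below illustrates one plausible instantiation of the recipe it describes, under explicit assumptions: the "constant penalty to outliers" is modeled as a truncated quadratic loss min(r^2, c^2); the zero-norm is approximated by the exponential surrogate 1 - exp(-alpha*|beta_j|), a surrogate commonly used in DC programming for feature selection; and the DC split, the ISTA inner solver, and the hyperparameters lam, alpha, c are illustrative choices. This is not the authors' implementation (their Matlab code is linked under Notes), and the paper's exact loss and decomposition may differ.

# Minimal, illustrative sketch of the approach described in the abstract —
# NOT the authors' implementation. Loss, surrogate, DC split, solver, and
# hyperparameters (lam, alpha, c) are assumptions made for illustration.
import numpy as np

def elm_hidden(X, n_hidden, rng):
    # Random sigmoid hidden layer of a standard ELM (input weights untrained).
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def dca_sparse_robust(H, y, lam=0.1, alpha=5.0, c=1.0, outer=30, inner=200):
    # Objective: sum_i min(r_i^2, c^2) + lam * sum_j (1 - exp(-alpha|b_j|)),
    # with r = H b - y.  DC split:
    #   g(b) = ||H b - y||^2 + lam*alpha*||b||_1                      (convex)
    #   h(b) = sum_i max(r_i^2 - c^2, 0)
    #        + lam * sum_j (alpha|b_j| - 1 + exp(-alpha|b_j|))        (convex)
    # DCA: take a subgradient v of h at the current iterate, then solve the
    # convex subproblem min g(b) - <v, b> (approximately) with ISTA.
    beta = np.zeros(H.shape[1])
    L = 2.0 * np.linalg.norm(H, 2) ** 2   # Lipschitz const. of the smooth part
    for _ in range(outer):
        r = H @ beta - y
        v = 2.0 * H.T @ (r * (r**2 > c**2))   # clipped residuals: flat penalty
        v += lam * alpha * np.sign(beta) * (1.0 - np.exp(-alpha * np.abs(beta)))
        b_k = beta.copy()
        for _ in range(inner):                # ISTA on the convex subproblem
            grad = 2.0 * H.T @ (H @ b_k - y) - v
            b_k = soft_threshold(b_k - grad / L, lam * alpha / L)
        beta = b_k
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(200, 1))
    y = np.sinc(X[:, 0]) + 0.05 * rng.normal(size=200)
    y[rng.choice(200, 10, replace=False)] += 3.0   # inject gross outliers
    beta = dca_sparse_robust(elm_hidden(X, 100, rng), y)
    print("active hidden nodes:", int(np.sum(np.abs(beta) > 1e-4)))

On this toy problem the truncated loss caps each outlier's contribution to the subgradient, and the l1 term that remains after linearizing the concave surrogate drives many output weights exactly to zero; hidden nodes with zero output weight can then be pruned, which is what "reducing the number of hidden nodes" refers to.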



Data availability

All data included in this study are available from the first author and can also be found on the websites provided in the manuscript.

Notes

  1. The Matlab code for our algorithm is available at: https://github.com/kaboqeboo/ASRELM.

  2. https://www.dcc.fc.up.pt/~ltorgo/Regression/DataSets.html.

  3. https://www.kaggle.com/datasets.

  4. http://lib.stat.cmu.edu/datasets/.

  5. http://archive.ics.uci.edu/ml/index.php.


Acknowledgements

The work was supported by the National Natural Science Foundation of China under Grant Nos. 61833005 and 61907033, the Postdoctoral Science Foundation of China under Grant No. 2018M642129, and the Postgraduate Innovation and Practice Ability Development Fund of Xi’an Shiyou University under Grant No. YCS22213168.

Author information


Contributions

All authors contributed to the study conception and design. Xiaoxue Wang: Conceptualisation of this study, Methodology, Software, Writing the manuscript. Kuaini Wang: Conceptualisation of this study, Research design, Editing of manuscript. Yanhong She and Jinde Cao: Data curation, Supervision, Editing of manuscript.

Corresponding author

Correspondence to Kuaini Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, X., Wang, K., She, Y. et al. Zero-Norm ELM with Non-convex Quadratic Loss Function for Sparse and Robust Regression. Neural Process Lett 55, 12367–12399 (2023). https://doi.org/10.1007/s11063-023-11424-9

