Stochastic configuration networks for imbalanced data classification

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Stochastic configuration networks (SCNs), a class of advanced randomized learner models, play an important role in predictive data analytics. On an imbalanced data classification task, the original SCN classifiers may fail to deliver satisfactory performance because of the density difference in the data distribution. This paper develops imbalanced learning for SCNs (IL-SCNs), a classifier design for skewed class distributions. Concretely, a balancer is proposed and used in IL-SCNs to compromise between the majority class and the minority class. In addition, a fast computation algorithm is adopted to update the output weights, which lowers the computational complexity of IL-SCNs. Experimental results show that IL-SCNs significantly outperform the existing state-of-the-art learning models.
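
The abstract does not give the form of the balancer, so the following is only a minimal sketch of the general idea of trading off the majority and minority classes. It assumes a common inverse-frequency weighting heuristic, which is not necessarily the scheme used in IL-SCNs. The function names `balancing_weights` and `weighted_squared_error` are hypothetical and chosen for illustration.

```python
from collections import Counter


def balancing_weights(labels):
    """Per-sample weights inversely proportional to class frequency.

    Assumed heuristic (not the paper's balancer): every class receives
    the same total weight, n_samples / n_classes.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


def weighted_squared_error(targets, outputs, weights):
    """Sum of squared errors, with each sample scaled by its weight."""
    return sum(w * (t - o) ** 2 for t, o, w in zip(targets, outputs, weights))


# Toy data: nine majority-class samples (0) and one minority-class sample (1).
labels = [0] * 9 + [1]
weights = balancing_weights(labels)

# A trivial model that always predicts the majority class.
outputs = [0.0] * len(labels)

plain = weighted_squared_error(labels, outputs, [1.0] * len(labels))
balanced = weighted_squared_error(labels, outputs, weights)

print(plain)     # 1.0: the single minority error barely registers
print(balanced)  # 5.0: the same error now carries half of the total weight
```

Under this assumed weighting, a model that ignores the minority class is penalized as heavily as one that ignores the majority class, which is the kind of compromise between the two classes that the abstract describes.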

Notes

  1. http://sci2s.ugr.es/keel/imbalanced.php.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61973306), the Natural Science Foundation of Jiangsu Province (BK20200086), in part by the National Key R&D Program of China (2018AAA0100304), the Open Project Foundation of State Key Laboratory of Synthetical Automation for Process Industries (2020-KF-21-10), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province.

Author information

Corresponding author

Correspondence to Dianhui Wang.

About this article

Cite this article

Dai, W., Ning, C., Nan, J. et al. Stochastic configuration networks for imbalanced data classification. Int. J. Mach. Learn. & Cyber. 13, 2843–2855 (2022). https://doi.org/10.1007/s13042-022-01565-z

  • DOI: https://doi.org/10.1007/s13042-022-01565-z
