Abstract
Stochastic configuration networks (SCNs), a class of advanced randomized learner models, play an important role in predictive data analytics. On an imbalanced data classification task, the original SCN classifiers may fail to deliver satisfactory performance because of density differences in the data distribution. This paper develops imbalanced learning for SCNs (IL-SCNs), a classifier design for skewed class distributions. Concretely, a balancer is proposed and used in IL-SCNs to trade off the majority class against the minority class. In addition, a fast computation algorithm is adopted to update the output weights, which lowers the computational complexity of IL-SCNs. Experimental results show that IL-SCNs significantly outperform existing state-of-the-art learning models.
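The abstract does not give the form of the balancer, so the following is only a minimal sketch of the general idea of trading off majority and minority classes. It assumes a generic inverse-class-frequency weighting, which is not claimed to be the IL-SCN balancer. The function names `class_balance_weights` and `weighted_error_rate` and the toy labels are hypothetical and chosen for illustration.

```python
from collections import Counter


def class_balance_weights(labels):
    """Per-sample weights inversely proportional to class frequency.

    Assumption for illustration only: weight = n / (k * count of the class),
    where n is the number of samples and k the number of classes.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def weighted_error_rate(y_true, y_pred, weights):
    """Share of the total sample weight that falls on misclassified samples."""
    wrong = sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)
    return wrong / sum(weights)


# Toy imbalanced task: nine majority samples (class 0), one minority sample (class 1).
labels = [0] * 9 + [1]
weights = class_balance_weights(labels)

# A degenerate classifier that always predicts the majority class.
always_majority = [0] * len(labels)

# Unweighted error rate hides the failure on the minority class.
plain = sum(t != p for t, p in zip(labels, always_majority)) / len(labels)

# The balanced error rate exposes it.
balanced = weighted_error_rate(labels, always_majority, weights)
```

On this toy data the unweighted error of the always-majority classifier is 0.1, while the balanced error is roughly 0.5, which is why an unweighted objective can look acceptable while ignoring the minority class entirely.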
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61973306), the Natural Science Foundation of Jiangsu Province (BK20200086), in part by the National Key R&D Program of China (2018AAA0100304), the Open Project Foundation of State Key Laboratory of Synthetical Automation for Process Industries (2020-KF-21-10), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Dai, W., Ning, C., Nan, J. et al. Stochastic configuration networks for imbalanced data classification. Int. J. Mach. Learn. & Cyber. 13, 2843–2855 (2022). https://doi.org/10.1007/s13042-022-01565-z
DOI: https://doi.org/10.1007/s13042-022-01565-z