Abstract
Learning from the imbalanced problem is among the most attractive issues in the contemporary machine learning community. However, the extensive majority of attention in this domain is given to the two-class imbalanced problems, while their much more complex multiclass counterparts are comparatively unexplored. It has been shown (Huang et al. in IEEE Trans Syst Man Cybern B (Cybern) 42(2):513–529, 2012) that extreme learning machine (ELM) achieves much better generalization performance compared to support vector machine (SVM) and least-squares support vector machine (LS-SVM) for multiclass classification problems. On this account, this work proposes a novel generalized class-specific extreme learning machine (GCS-ELM), the extension of our recently proposed, class-specific extreme learning machine (CS-ELM) to address the multiclass imbalanced problems more effectively. The proposed GCS-ELM can be applied directly to the multiclass imbalance problems. The proposed method also has reduced computational cost compared to the weighted extreme learning machine (WELM) for multiclass imbalance problems. The proposed method uses class-specific regularization coefficients, which are computed by employing class distribution. The proposed method has lower computational overhead compared to the class-specific cost regulation extreme learning machine (CCR-ELM). The proposed work is assessed by using benchmark real-world imbalanced datasets downloaded from the well-known KEEL dataset repository and synthetic datasets. The experimental results, supported by the extensive statistical analysis, demonstrate that GCS-ELM is capable to improve the generalization performance for multiclass imbalanced classification problems.





Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Huang, G.B., Zhou, H., Ding, X., Zhang, R.: Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. B (Cybern.) 42(2), 513–529 (2012)
He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)
Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing, G.: Learning from class-imbalanced data: review of methods and applications. Expert Syst. Appl. 73, 220–239 (2017)
Das, S., Datta, S., Chaudhuri, B.B.: Handling data irregularities in classification: foundations, trends, and future challenges. Pattern Recognit. 81, 674–693 (2018)
Parvin, H., Minaei-Bidgoli, B., Alizadeh, H.: Detection of cancer patients using an innovative method for learning at imbalanced datasets. In: Yao, J., Ramanna, S., Wang, G., Suraj, Z. (eds.) Rough Sets and Knowledge Technology, pp. 376–381. Springer, Berlin (2011)
Kubat, M., Holte, R.C., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Mach. Learn. 30(2), 195–215 (1998)
Wang, S., Yao, X.: Using class imbalance learning for software defect prediction. IEEE Trans. Reliab. 62(2), 434–443 (2013)
Krawczyk, B., Galar, M., Jele, L., Herrera, F.: Evolutionary undersampling boosting for imbalanced classification of breast cancer malignancy. Appl. Soft Comput. 38(C), 714–726 (2016)
Galar, M., Fernandez, A., Barrenechea, E., Bustince, H., Herrera, F.: A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 42(4), 463–484 (2012)
Liu, X.Y., Wu, J., Zhou, Z.H.: Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. B (Cybern.) 39(2), 539–550 (2009)
Krawczyk, B., Koziarski, M., Woźniak, M.: Radial-based oversampling for multiclass imbalanced data classification. IEEE Trans. Neural Netw. Learn. Syst. 31(8), 2818–2831 (2020)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Int. Res. 16(1), 321–357 (2002)
He, H., Bai, Y., Garcia, E.A., Li, S.: Adasyn: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328 (2008)
Han, H., Wang, W.Y., Mao, B.H.: Borderline-smote: a new over-sampling method in imbalanced data sets learning. In: Huang, D.S., Zhang, X.P., Huang, G.B. (eds.) Advances in Intelligent Computing, pp. 878–887. Springer, Berlin (2005)
Zhou, Z.-H., Liu, X.-Y.: Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans. Knowl. Data Eng. 18(1), 63–77 (2006)
Lin, M., Tang, K., Yao, X.: Dynamic sampling approach to training neural networks for multiclass imbalance classification. IEEE Trans. Neural Netw. Learn. Syst. 24(4), 647–660 (2013)
Tang, Y., Zhang, Y.Q., Chawla, N.V., Krasser, S.: Svms modeling for highly imbalanced classification. IEEE Trans. Syst. Man Cybern. B (Cybern.) 39(1), 281–288 (2009)
Cieslak, D.A., Hoens, T.R., Chawla, N.V., Kegelmeyer, W.P.: Hellinger distance decision trees are robust and skew-insensitive. Data Min. Knowl. Disc. 24(1), 136–158 (2012)
Sun, Y., Kamel, M.S., Wong, A.K.C., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit. 40(12), 3358–3378 (2007)
Zong, W., Huang, G.B., Chen, Y.: Weighted extreme learning machine for imbalance learning. Neurocomputing 101, 229–242 (2013)
Yang, X., Song, Q., Wang, Y.: A weighted support vector machine for data classification. Int. J. Pattern Recognit. Artif. Intell. 21(05), 961–976 (2007)
Lim, P., Goh, C.K., Tan, K.C.: Evolutionary cluster-based synthetic oversampling ensemble (eco-ensemble) for imbalance learning. IEEE Trans. Cybern. 47(9), 2850–2861 (2017)
Wang, S., Yao, X.: Multiclass imbalance problems: analysis and potential solutions. IEEE Trans. Syst. Man Cybern. B (Cybern.) 42(4), 1119–1130 (2012)
Abdi, L., Hashemi, S.: To combat multi-class imbalanced problems by means of over-sampling techniques. IEEE Trans. Knowl. Data Eng. 28(1), 238–251 (2016)
Sáez, J.A., Krawczyk, B., Woźniak, M.: Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets. Pattern Recognit. 57, 164–178 (2016)
Fürnkranz, J.: Round robin classification. J. Mach. Learn. Res. 2, 721–747 (2002)
Galar, M., Fernández, A., Barrenechea, E., Bustince, H., Herrera, F.: An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit. 44(8), 1761–1776 (2011)
Sen, A., Islam, M.M., Murase, K., Yao, X.: Binarization with boosting and oversampling for multiclass classification. IEEE Trans. Cybern. 46(5), 1078–1091 (2016)
Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications. Neurocomputing 70(1–3), 489–501 (2006)
Janakiraman, V.M., Nguyen, X., Sterniak, J., Assanis, D.: Identification of the dynamic operating envelope of hcci engines using class imbalance learning. IEEE Trans. Neural Netw. Learn. Syst. 26(1), 98–112 (2015)
Janakiraman, V.M., Nguyen, X., Assanis, D.: Stochastic gradient based extreme learning machines for stable online learning of advanced combustion engines. Neurocomputing 177, 304–316 (2016)
Li, K., Kong, X., Lu, Z., Wenyin, L., Yin, J.: Boosting weighted ELM for imbalanced learning. Neurocomputing 128, 15–21 (2014)
Xiao, W., Zhang, J., Li, Y., Zhang, S., Yang, W.: Class-specific cost regulation extreme learning machine for imbalanced classification. Neurocomputing 261, 70–82 (2017)
Raghuwanshi, B.S., Shukla, S.: Underbagging based reduced kernelized weighted extreme learning machine for class imbalance learning. Eng. Appl. Artif. Intell. 74, 252–270 (2018)
Raghuwanshi, B.S., Shukla, S.: Generalized class-specific kernelized extreme learning machine for multiclass imbalanced learning. Expert Syst. Appl. 121, 244–255 (2019)
Raghuwanshi, B.S., Shukla, S.: Class imbalance learning using underbagging based kernelized extreme learning machine. Neurocomputing 329, 172–187 (2019)
Raghuwanshi, B.S., Shukla, S.: Class-specific cost-sensitive boosting weighted elm for class imbalance learning. Memet. Comput. 11(3), 263–283 (2019)
Raghuwanshi, B.S., Shukla, S.: Classifying imbalanced data using balance cascade-based kernelized extreme learning machine. Pattern Anal. Appl. 23(3), 1157–1182 (2020)
Raghuwanshi, B.S., Shukla, S.: Classifying imbalanced data using ensemble of reduced kernelized weighted extreme learning machine. Int. J. Mach. Learn. Cybern. 10(11), 3071–3097 (2019)
Shukla, S., Raghuwanshi, B.S.: Online sequential class-specific extreme learning machine for binary imbalanced learning. Neural Netw. 119, 235–248 (2019)
He, H., Ma, Y.: Class Imbalance Learning Methods for Support Vector Machines, p. 216. Wiley, Hoboken (2013)
Raghuwanshi, B.S., Shukla, S.: Class-specific extreme learning machine for handling binary class imbalance problem. Neural Netw. 105, 206–217 (2018)
Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 42(1), 80–86 (2000)
Raghuwanshi, B.S., Shukla, S.: Class-specific kernelized extreme learning machine for binary class imbalance learning. Appl. Soft Comput. 73, 1026–1038 (2018)
Jain, A.K., Duin, R.P.W., Mao, J.: Statistical pattern recognition: a review. IEEE Trans. Pattern Anal. Mach. Intell. 22(1), 4–37 (2000)
Macià, N., Bernadó-Mansilla, E., Orriols-Puig, A., Ho, T.K.: Learner excellence biased by data set selection: a case for data characterisation and artificial data sets. Pattern Recognit. 46(3), 1054–1066 (2013)
Wolpert, D.H.: The lack of a priori distinctions between learning algorithms. Neural Comput. 8(7), 1341–1390 (1996)
Alcalá, J., Fernández, A., Luengo, J., Derrac, J., García, S., Sánchez, L., Herrera, F.: Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Mult.-Valued Log. Soft Comput. 17(2–3), 255–287 (2011)
Dua, D., Graff, C.: UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences (2017). http://archive.ics.uci.edu/ml. Accessed 26 Feb 2021
Yuan, X., Xie, L., Abouelenien, M.: A regularized ensemble framework of deep learning for cancer detection from multi-class, imbalanced training data. Pattern Recognit. 77, 160–172 (2018)
Kubat, M., Holte, R., Matwin, S.: Learning when negative examples abound. In: van Someren, M., Widmer, G. (eds.) Machine Learning: ECML-97, pp. 146–153. Springer, Berlin (1997)
Bradley, A.P.: The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recognit. 30(7), 1145–1159 (1997)
Ferri, C., Hernández-Orallo, J., Modroiu, R.: An experimental comparison of performance measures for classification. Pattern Recognit. Lett. 30(1), 27–38 (2009)
Hand, D.J., Till, R.J.: A simple generalisation of the area under the roc curve for multiple class classification problems. Mach. Learn. 45(2), 171–186 (2001)
Tang, K., Wang, R., Chen, T.: Towards maximizing the area under the roc curve for multi-class classification problems. In: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, AAAI’11, pp. 483–488 (2011)
Seiffert, C., Khoshgoftaar, T.M., Hulse, J.V., Napolitano, A.: Rusboost: a hybrid approach to alleviating class imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 40(1), 185–197 (2010)
Mathew, J., Pang, C.K., Luo, M., Leong, W.H.: Classification of imbalanced data by oversampling in kernel space of support vector machines. IEEE Trans. Neural Netw. Learn. Syst. 29(9), 4065–4076 (2018)
Nanni, L., Fantozzi, C., Lazzarini, N.: Coupling different methods for overcoming the class imbalance problem. Neurocomputing 158(C), 48–61 (2015)
Fernández-Navarro, F., Hervás-Martínez, C., Gutiérrez, P.A.: A dynamic over-sampling procedure based on sensitivity for multi-class problems. Pattern Recognit. 44(8), 1821–1833 (2011)
Datta, S., Das, S.: Near-Bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs. Neural Netw. 70, 39–52 (2015)
Wang, S., Yao, X.: Diversity analysis on imbalanced data sets by using ensemble models. In: 2009 IEEE Symposium on Computational Intelligence and Data Mining, pp. 324–331 (2009)
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Gregory, W., Foreman, D.: Nonparametric statistics for non-statisticians. Wiley, Hoboken (2009). https://doi.org/10.1002/9781118165881
Galar, M., Fernández, A., Barrenechea, E., Herrera, F.: Eusboost: enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recognit. 46(12), 3460–3471 (2013)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Raghuwanshi, B.S., Shukla, S. Classifying multiclass imbalanced data using generalized class-specific extreme learning machine. Prog Artif Intell 10, 259–281 (2021). https://doi.org/10.1007/s13748-021-00236-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13748-021-00236-4