Classifying multiclass imbalanced data using generalized class-specific extreme learning machine

  • Regular Paper
  • Published in: Progress in Artificial Intelligence

Abstract

Learning from imbalanced data is among the most actively studied issues in the contemporary machine learning community. However, the vast majority of attention in this domain has been given to two-class imbalanced problems, while their much more complex multiclass counterparts remain comparatively unexplored. It has been shown (Huang et al. in IEEE Trans Syst Man Cybern B (Cybern) 42(2):513–529, 2012) that the extreme learning machine (ELM) achieves much better generalization performance than the support vector machine (SVM) and the least-squares support vector machine (LS-SVM) for multiclass classification problems. On this account, this work proposes a novel generalized class-specific extreme learning machine (GCS-ELM), an extension of our recently proposed class-specific extreme learning machine (CS-ELM), to address multiclass imbalanced problems more effectively. The proposed GCS-ELM can be applied directly to multiclass imbalance problems and has a reduced computational cost compared to the weighted extreme learning machine (WELM) on such problems. It uses class-specific regularization coefficients, which are computed from the class distribution, and incurs lower computational overhead than the class-specific cost regulation extreme learning machine (CCR-ELM). The proposed work is assessed using benchmark real-world imbalanced datasets from the well-known KEEL dataset repository along with synthetic datasets. The experimental results, supported by extensive statistical analysis, demonstrate that GCS-ELM is capable of improving the generalization performance for multiclass imbalanced classification problems.
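
Although the full derivation is not reproduced here, the core idea described above can be illustrated with a short sketch: assign each class a regularization coefficient derived from the class distribution so that errors on minority classes are penalized more strongly, then solve a single regularized least-squares problem for the ELM output weights. The sketch below is a hypothetical simplification, not the authors' exact GCS-ELM formulation; the sigmoid activation, the inverse-class-frequency rule for the coefficients, and the function names are illustrative assumptions.

```python
import numpy as np

def train_class_specific_elm(X, y, n_hidden=100, C=1.0, seed=0):
    """Sketch of an ELM with class-specific regularization (not the exact GCS-ELM)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    k = classes.size

    # Random input weights and biases with a sigmoid hidden layer (standard ELM).
    W = rng.normal(size=(d, n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))          # hidden-layer output, n x n_hidden

    # One-hot targets encoded in {-1, +1}, the usual ELM multiclass scheme.
    idx = np.searchsorted(classes, y)
    T = -np.ones((n, k))
    T[np.arange(n), idx] = 1.0

    # Class-specific coefficients from the class distribution (assumed rule):
    # smaller classes get relatively larger coefficients, so their errors cost more.
    c_class = C * counts.min() / counts             # shape (k,)
    s = np.sqrt(c_class[idx])[:, None]              # per-sample sqrt of the coefficient

    # Weighted ridge solution for the output weights:
    # beta = (I + H^T S H)^(-1) H^T S T, with S = diag of per-sample coefficients.
    Hs, Ts = H * s, T * s
    beta = np.linalg.solve(np.eye(n_hidden) + Hs.T @ Hs, Hs.T @ Ts)
    return W, b, beta, classes

def predict_class_specific_elm(X, model):
    W, b, beta, classes = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return classes[np.argmax(H @ beta, axis=1)]
```

Because all classes are handled in a single (n_hidden x n_hidden) linear system, the per-class coefficients add essentially no cost over a plain regularized ELM, which is in line with the computational-cost claims made in the abstract.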

References

  1. Huang, G.B., Zhou, H., Ding, X., Zhang, R.: Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. B (Cybern.) 42(2), 513–529 (2012)

  2. He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)

  3. Haixiang, G., Yijing, L., Shang, J., Mingyun, G., Yuanyue, H., Bing, G.: Learning from class-imbalanced data: review of methods and applications. Expert Syst. Appl. 73, 220–239 (2017)

  4. Das, S., Datta, S., Chaudhuri, B.B.: Handling data irregularities in classification: foundations, trends, and future challenges. Pattern Recognit. 81, 674–693 (2018)

  5. Parvin, H., Minaei-Bidgoli, B., Alizadeh, H.: Detection of cancer patients using an innovative method for learning at imbalanced datasets. In: Yao, J., Ramanna, S., Wang, G., Suraj, Z. (eds.) Rough Sets and Knowledge Technology, pp. 376–381. Springer, Berlin (2011)

  6. Kubat, M., Holte, R.C., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Mach. Learn. 30(2), 195–215 (1998)

  7. Wang, S., Yao, X.: Using class imbalance learning for software defect prediction. IEEE Trans. Reliab. 62(2), 434–443 (2013)

  8. Krawczyk, B., Galar, M., Jeleń, Ł., Herrera, F.: Evolutionary undersampling boosting for imbalanced classification of breast cancer malignancy. Appl. Soft Comput. 38(C), 714–726 (2016)

  9. Galar, M., Fernandez, A., Barrenechea, E., Bustince, H., Herrera, F.: A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 42(4), 463–484 (2012)

  10. Liu, X.Y., Wu, J., Zhou, Z.H.: Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. B (Cybern.) 39(2), 539–550 (2009)

  11. Krawczyk, B., Koziarski, M., Woźniak, M.: Radial-based oversampling for multiclass imbalanced data classification. IEEE Trans. Neural Netw. Learn. Syst. 31(8), 2818–2831 (2020)

  12. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16(1), 321–357 (2002)

  13. He, H., Bai, Y., Garcia, E.A., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328 (2008)

  14. Han, H., Wang, W.Y., Mao, B.H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: Huang, D.S., Zhang, X.P., Huang, G.B. (eds.) Advances in Intelligent Computing, pp. 878–887. Springer, Berlin (2005)

  15. Zhou, Z.-H., Liu, X.-Y.: Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans. Knowl. Data Eng. 18(1), 63–77 (2006)

  16. Lin, M., Tang, K., Yao, X.: Dynamic sampling approach to training neural networks for multiclass imbalance classification. IEEE Trans. Neural Netw. Learn. Syst. 24(4), 647–660 (2013)

  17. Tang, Y., Zhang, Y.Q., Chawla, N.V., Krasser, S.: SVMs modeling for highly imbalanced classification. IEEE Trans. Syst. Man Cybern. B (Cybern.) 39(1), 281–288 (2009)

  18. Cieslak, D.A., Hoens, T.R., Chawla, N.V., Kegelmeyer, W.P.: Hellinger distance decision trees are robust and skew-insensitive. Data Min. Knowl. Disc. 24(1), 136–158 (2012)

  19. Sun, Y., Kamel, M.S., Wong, A.K.C., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit. 40(12), 3358–3378 (2007)

  20. Zong, W., Huang, G.B., Chen, Y.: Weighted extreme learning machine for imbalance learning. Neurocomputing 101, 229–242 (2013)

  21. Yang, X., Song, Q., Wang, Y.: A weighted support vector machine for data classification. Int. J. Pattern Recognit. Artif. Intell. 21(05), 961–976 (2007)

  22. Lim, P., Goh, C.K., Tan, K.C.: Evolutionary cluster-based synthetic oversampling ensemble (eco-ensemble) for imbalance learning. IEEE Trans. Cybern. 47(9), 2850–2861 (2017)

  23. Wang, S., Yao, X.: Multiclass imbalance problems: analysis and potential solutions. IEEE Trans. Syst. Man Cybern. B (Cybern.) 42(4), 1119–1130 (2012)

  24. Abdi, L., Hashemi, S.: To combat multi-class imbalanced problems by means of over-sampling techniques. IEEE Trans. Knowl. Data Eng. 28(1), 238–251 (2016)

  25. Sáez, J.A., Krawczyk, B., Woźniak, M.: Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets. Pattern Recognit. 57, 164–178 (2016)

  26. Fürnkranz, J.: Round robin classification. J. Mach. Learn. Res. 2, 721–747 (2002)

  27. Galar, M., Fernández, A., Barrenechea, E., Bustince, H., Herrera, F.: An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit. 44(8), 1761–1776 (2011)

  28. Sen, A., Islam, M.M., Murase, K., Yao, X.: Binarization with boosting and oversampling for multiclass classification. IEEE Trans. Cybern. 46(5), 1078–1091 (2016)

  29. Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications. Neurocomputing 70(1–3), 489–501 (2006)

  30. Janakiraman, V.M., Nguyen, X., Sterniak, J., Assanis, D.: Identification of the dynamic operating envelope of HCCI engines using class imbalance learning. IEEE Trans. Neural Netw. Learn. Syst. 26(1), 98–112 (2015)

  31. Janakiraman, V.M., Nguyen, X., Assanis, D.: Stochastic gradient based extreme learning machines for stable online learning of advanced combustion engines. Neurocomputing 177, 304–316 (2016)

  32. Li, K., Kong, X., Lu, Z., Wenyin, L., Yin, J.: Boosting weighted ELM for imbalanced learning. Neurocomputing 128, 15–21 (2014)

  33. Xiao, W., Zhang, J., Li, Y., Zhang, S., Yang, W.: Class-specific cost regulation extreme learning machine for imbalanced classification. Neurocomputing 261, 70–82 (2017)

  34. Raghuwanshi, B.S., Shukla, S.: Underbagging based reduced kernelized weighted extreme learning machine for class imbalance learning. Eng. Appl. Artif. Intell. 74, 252–270 (2018)

  35. Raghuwanshi, B.S., Shukla, S.: Generalized class-specific kernelized extreme learning machine for multiclass imbalanced learning. Expert Syst. Appl. 121, 244–255 (2019)

  36. Raghuwanshi, B.S., Shukla, S.: Class imbalance learning using underbagging based kernelized extreme learning machine. Neurocomputing 329, 172–187 (2019)

  37. Raghuwanshi, B.S., Shukla, S.: Class-specific cost-sensitive boosting weighted ELM for class imbalance learning. Memet. Comput. 11(3), 263–283 (2019)

  38. Raghuwanshi, B.S., Shukla, S.: Classifying imbalanced data using balance cascade-based kernelized extreme learning machine. Pattern Anal. Appl. 23(3), 1157–1182 (2020)

  39. Raghuwanshi, B.S., Shukla, S.: Classifying imbalanced data using ensemble of reduced kernelized weighted extreme learning machine. Int. J. Mach. Learn. Cybern. 10(11), 3071–3097 (2019)

  40. Shukla, S., Raghuwanshi, B.S.: Online sequential class-specific extreme learning machine for binary imbalanced learning. Neural Netw. 119, 235–248 (2019)

  41. He, H., Ma, Y.: Class Imbalance Learning Methods for Support Vector Machines, p. 216. Wiley, Hoboken (2013)

  42. Raghuwanshi, B.S., Shukla, S.: Class-specific extreme learning machine for handling binary class imbalance problem. Neural Netw. 105, 206–217 (2018)

  43. Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 42(1), 80–86 (2000)

  44. Raghuwanshi, B.S., Shukla, S.: Class-specific kernelized extreme learning machine for binary class imbalance learning. Appl. Soft Comput. 73, 1026–1038 (2018)

  45. Jain, A.K., Duin, R.P.W., Mao, J.: Statistical pattern recognition: a review. IEEE Trans. Pattern Anal. Mach. Intell. 22(1), 4–37 (2000)

  46. Macià, N., Bernadó-Mansilla, E., Orriols-Puig, A., Ho, T.K.: Learner excellence biased by data set selection: a case for data characterisation and artificial data sets. Pattern Recognit. 46(3), 1054–1066 (2013)

  47. Wolpert, D.H.: The lack of a priori distinctions between learning algorithms. Neural Comput. 8(7), 1341–1390 (1996)

  48. Alcalá, J., Fernández, A., Luengo, J., Derrac, J., García, S., Sánchez, L., Herrera, F.: KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Mult.-Valued Log. Soft Comput. 17(2–3), 255–287 (2011)

  49. Dua, D., Graff, C.: UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences (2017). http://archive.ics.uci.edu/ml. Accessed 26 Feb 2021

  50. Yuan, X., Xie, L., Abouelenien, M.: A regularized ensemble framework of deep learning for cancer detection from multi-class, imbalanced training data. Pattern Recognit. 77, 160–172 (2018)

  51. Kubat, M., Holte, R., Matwin, S.: Learning when negative examples abound. In: van Someren, M., Widmer, G. (eds.) Machine Learning: ECML-97, pp. 146–153. Springer, Berlin (1997)

  52. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 30(7), 1145–1159 (1997)

  53. Ferri, C., Hernández-Orallo, J., Modroiu, R.: An experimental comparison of performance measures for classification. Pattern Recognit. Lett. 30(1), 27–38 (2009)

  54. Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach. Learn. 45(2), 171–186 (2001)

  55. Tang, K., Wang, R., Chen, T.: Towards maximizing the area under the ROC curve for multi-class classification problems. In: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, AAAI’11, pp. 483–488 (2011)

  56. Seiffert, C., Khoshgoftaar, T.M., Hulse, J.V., Napolitano, A.: RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 40(1), 185–197 (2010)

  57. Mathew, J., Pang, C.K., Luo, M., Leong, W.H.: Classification of imbalanced data by oversampling in kernel space of support vector machines. IEEE Trans. Neural Netw. Learn. Syst. 29(9), 4065–4076 (2018)

  58. Nanni, L., Fantozzi, C., Lazzarini, N.: Coupling different methods for overcoming the class imbalance problem. Neurocomputing 158(C), 48–61 (2015)

  59. Fernández-Navarro, F., Hervás-Martínez, C., Gutiérrez, P.A.: A dynamic over-sampling procedure based on sensitivity for multi-class problems. Pattern Recognit. 44(8), 1821–1833 (2011)

  60. Datta, S., Das, S.: Near-Bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs. Neural Netw. 70, 39–52 (2015)

  61. Wang, S., Yao, X.: Diversity analysis on imbalanced data sets by using ensemble models. In: 2009 IEEE Symposium on Computational Intelligence and Data Mining, pp. 324–331 (2009)

  62. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)

  63. Corder, G.W., Foreman, D.I.: Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach. Wiley, Hoboken (2009). https://doi.org/10.1002/9781118165881

  64. Galar, M., Fernández, A., Barrenechea, E., Herrera, F.: EUSBoost: enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recognit. 46(12), 3460–3471 (2013)

Author information

Corresponding author

Correspondence to Bhagat Singh Raghuwanshi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Raghuwanshi, B.S., Shukla, S. Classifying multiclass imbalanced data using generalized class-specific extreme learning machine. Prog Artif Intell 10, 259–281 (2021). https://doi.org/10.1007/s13748-021-00236-4
