Abstract
Learning from imbalanced data so as to classify it accurately is an important research topic in data mining. High accuracy is difficult for a classification algorithm to achieve because the uneven distribution of samples leaves some categories with very few samples. This article proposes a support vector machine algorithm with kernel extension (KE-SVM) for imbalanced data classification. The algorithm first obtains an initial classification of the samples by training a maximum-margin SVM model. It then derives a new kernel extension function based on a chi-square test and weight coefficient calculation, and trains on the samples again using an SVM with the new kernel function to improve classification accuracy. Simulation experiments on real and artificial data sets show that the proposed method achieves higher classification accuracy and faster convergence on unevenly distributed data.
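The abstract does not give the form of the kernel extension function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a conformal scaling of a base RBF kernel, K_ext(x, z) = D(x) K(x, z) D(z), where D is built from first-stage support vectors and weighted by inverse class frequency. The chi-square step is omitted, and the names `rbf`, `class_weights`, `make_extended_kernel`, `gamma`, and `tau` are all hypothetical.

```python
import math


def rbf(x, z, gamma=0.5):
    """Base RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def class_weights(labels):
    """Inverse-frequency weights (an assumption): rarer classes weigh more."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {y: n / (len(counts) * c) for y, c in counts.items()}


def make_extended_kernel(support_vectors, sv_labels, weights, gamma=0.5, tau=1.0):
    """Return (D, k_ext) with k_ext(x, z) = D(x) * K(x, z) * D(z).

    D(x) is a weighted sum of RBF bumps centred on the support vectors
    found by a first-stage SVM (supplied by the caller here).
    D(x) > 0 everywhere, so k_ext stays a valid (PSD) kernel.
    """
    def D(x):
        return sum(
            weights[y] * rbf(x, sv, 1.0 / tau ** 2)
            for sv, y in zip(support_vectors, sv_labels)
        )

    def k_ext(x, z):
        return D(x) * rbf(x, z, gamma) * D(z)

    return D, k_ext


# Toy imbalanced labels: 8 majority (0) samples, 2 minority (1) samples.
labels = [0] * 8 + [1] * 2
weights = class_weights(labels)

# Hypothetical first-stage support vectors, one per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
sv_labels = [0, 1]

D, k_ext = make_extended_kernel(support_vectors, sv_labels, weights)
```

Under these assumptions, D is larger near the minority-class support vector than near the majority-class one, so the extended kernel magnifies similarity in the region where the minority class lies. The second stage would retrain an SVM with `k_ext` as its kernel, for example through a solver that accepts a precomputed Gram matrix.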



Cite this article
Wang, F., Liu, S., Ni, W. et al. Imbalanced data classification algorithm with support vector machine kernel extensions. Evol. Intel. 12, 341–347 (2019). https://doi.org/10.1007/s12065-018-0182-0
DOI: https://doi.org/10.1007/s12065-018-0182-0