Abstract
To reduce classifier overfitting and improve classification performance on imbalanced datasets, a dynamic density equalization algorithm is proposed for imbalanced data classification. The algorithm performs hierarchical clustering guided by the relationship between the sample densities of the different classes. First, the majority-class samples are divided into multiple particles by K-means clustering in the kernel space. Next, each particle is clustered further according to the relationship between the particle's density and that of the minority class. Each particle is then replaced by the sample that is most similar to the particle's center. Finally, the new training dataset is formed from these samples and the final classifier is trained on it. The algorithm can mitigate the class imbalance problem and improve the classification performance of SVM. Experimental results on an artificial dataset and four groups of UCI datasets show that the algorithm is effective for imbalanced datasets.
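The undersampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the density-based rule for splitting particles further, so this sketch assumes a single clustering pass with the number of particles set to the minority-class size. The RBF kernel, the `gamma` parameter, and the function names `rbf`, `kernel_kmeans`, and `equalize` are all illustrative assumptions.

```python
import math
import random


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_kmeans(points, k, gamma=1.0, iters=20, seed=0):
    """K-means in the RBF feature space; returns non-empty clusters as index lists."""
    n = len(points)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    labels = [i % k for i in range(n)]          # every cluster starts non-empty
    random.Random(seed).shuffle(labels)
    for _ in range(iters):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster center in feature space.
        norm2 = [sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
                 for m in clusters]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, m in enumerate(clusters):
                if not m:
                    continue
                # ||phi(x_i) - center_c||^2 expressed with kernel values only.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + norm2[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
    return [m for m in clusters if m]


def equalize(majority, minority, gamma=1.0):
    """Replace each majority-class particle by its member closest to the center.

    Assumption: the particle count equals the minority-class size; the paper's
    dynamic, density-driven splitting rule is not reproduced here.
    """
    k = min(len(minority), len(majority))
    reps = []
    for members in kernel_kmeans(majority, k, gamma):
        # For an RBF kernel K(x, x) = 1, so the member with the largest summed
        # similarity to its particle is the one nearest the particle's center.
        best = max(members,
                   key=lambda i: sum(rbf(majority[i], majority[j], gamma)
                                     for j in members))
        reps.append(majority[best])
    return reps
```

The list returned by `equalize`, combined with the untouched minority samples, would form the reduced training set that is then passed to an SVM trainer of choice.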
Acknowledgement
This work was supported by the Shaanxi Provincial Natural Science Foundation (Grant No. 2014JM2-6122), the Scientific Research Program of the Shaanxi Provincial Education Department (Grant No. 15JK1218), and the Science and Technology Foundation of Shangluo University (Grant No. 15SKY010).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Du, H., Teng, S., Zhang, L., Zhang, Y. (2016). Support Vector Machine Based on Dynamic Density Equalization. In: Zu, Q., Hu, B. (eds) Human Centered Computing. HCC 2016. Lecture Notes in Computer Science, vol 9567. Springer, Cham. https://doi.org/10.1007/978-3-319-31854-7_6
DOI: https://doi.org/10.1007/978-3-319-31854-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31853-0
Online ISBN: 978-3-319-31854-7
eBook Packages: Computer Science, Computer Science (R0)