Multiclass Imbalanced Classification Using Fuzzy C-Mean and SMOTE with Fuzzy Support Vector Machine

Pruengkarn, Ratchakoon; Wong, Kok Wai; Fung, Chun Che

doi:10.1007/978-3-319-70139-4_7

Conference paper
First Online: 29 October 2017

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10638)

Abstract

A hybrid sampling technique that combines Fuzzy C-Mean clustering with the Synthetic Minority Oversampling Technique (FCMSMT) is proposed for the imbalanced multiclass classification problem. The mean number of instances per class is used as the target class size for both undersampling and oversampling. Fixing the required number of instances for each class at this mean helps prevent within-class imbalanced data from being eliminated erroneously during undersampling. The technique can reduce both within-class and between-class errors, and thus can improve classification performance. The study used eight benchmark datasets from the KEEL and UCI repositories, and the results were compared against three major classifiers based on G-mean and AUC measurements. The results indicate that the proposed technique could handle most of the multiclass imbalanced datasets used in the experiments for all classifiers while retaining the integrity of the original data.
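
The resampling idea in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not say how Fuzzy C-Mean clustering selects the instances to keep, so the undersampling step below uses a plain nearest-to-centroid rule as a stand-in, not the authors' FCM procedure. The oversampling step is a simplified SMOTE-style interpolation between same-class neighbours. All function names and the parameter k are hypothetical.

```python
import math
import random
from collections import defaultdict


def mean_class_size(labels):
    """Target size per class: the mean number of instances per class, rounded."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    return round(sum(counts.values()) / len(counts))


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def undersample(points, target):
    """ASSUMPTION: stand-in for the FCM step. Keeps the `target` points
    closest to the class centroid; the paper's actual rule is not given
    in the abstract."""
    centre = centroid(points)
    return sorted(points, key=lambda p: math.dist(p, centre))[:target]


def smote_oversample(points, target, k, rng):
    """Simplified SMOTE: interpolate between a random point and one of its
    k nearest same-class neighbours until the class reaches `target`."""
    synthetic = []
    while len(points) + len(synthetic) < target:
        base = rng.choice(points)
        neighbours = sorted(
            (p for p in points if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        if not neighbours:  # single-instance class: duplicate the point
            synthetic.append(list(base))
            continue
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return points + synthetic


def resample_to_mean(X, y, k=3, seed=0):
    """Bring every class to the mean class size: shrink larger classes,
    grow smaller ones, leave classes already at the mean untouched."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(list(features))
    target = mean_class_size(y)
    X_out, y_out = [], []
    for label, points in by_class.items():
        if len(points) > target:
            points = undersample(points, target)
        elif len(points) < target:
            points = smote_oversample(points, target, k, rng)
        X_out.extend(points)
        y_out.extend([label] * len(points))
    return X_out, y_out
```

For example, with class sizes 9, 4 and 2 the mean is 5, so the first class would be reduced to 5 instances and the other two raised to 5. Real experiments would more likely rely on an established SMOTE implementation and a proper fuzzy clustering routine than on this sketch.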

Author information

Authors and Affiliations

School of Engineering and Information Technology, Murdoch University, Perth, Australia
Ratchakoon Pruengkarn, Kok Wai Wong & Chun Che Fung

Corresponding author

Correspondence to Ratchakoon Pruengkarn.

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

About this paper

Cite this paper

Pruengkarn, R., Wong, K.W., Fung, C.C. (2017). Multiclass Imbalanced Classification Using Fuzzy C-Mean and SMOTE with Fuzzy Support Vector Machine. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10638. Springer, Cham. https://doi.org/10.1007/978-3-319-70139-4_7

DOI: https://doi.org/10.1007/978-3-319-70139-4_7
Published: 29 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70138-7
Online ISBN: 978-3-319-70139-4
eBook Packages: Computer Science, Computer Science (R0)