Abstract
A robust and efficient classifier, the affinity and class probability-based fuzzy support vector machine (ACFSVM), was recently proposed to address binary class imbalance and noisy data classification. Although ACFSVM generalizes well, its classification ability can still be improved. To this end, this work proposes a novel fuzzy twin support vector machine based on affinity and class probability (ACFTSVM). In ACFTSVM, regularization terms are added to the primal problems to diminish the negative influence of noise. The affinity (AF) of each majority (MJ) class datapoint is measured using a support vector data description model trained in kernel space on the MJ class training samples only. The k-nearest neighbour method is then used to estimate the class probability (CP) of the MJ class datapoints in the same kernel space, which reduces the influence of noise. Samples with lower CPs are more likely to be noise, and their contribution to learning is reduced by their low memberships, which are calculated by combining the AFs and the CPs. Like ACFSVM, ACFTSVM gives preference to MJ class datapoints with higher AFs and CPs while minimizing the influence of MJ class samples with lower AFs and CPs. As a result, the decision boundary is skewed towards the MJ class. Five artificially imbalanced datasets and a few notable real-world datasets are used in the numerical simulations.
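The membership idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several parts are assumptions made only for illustration: the RBF kernel and its `gamma`, the use of distance to the kernel-space centroid as a simple stand-in for the support vector data description model, the product rule used to combine AF and CP, the `delta` offset, and all function names and toy data.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2) (kernel choice is an assumption)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_dist2(x, y, gamma=1.0):
    """Squared feature-space distance; k(x, x) = k(y, y) = 1 for the RBF kernel."""
    return 2.0 - 2.0 * rbf(x, y, gamma)

def class_probabilities(X_mj, X_mn, k=3, gamma=1.0):
    """CP of each MJ point: share of its k nearest kernel-space neighbours that are MJ."""
    X = list(X_mj) + list(X_mn)
    is_mj = [True] * len(X_mj) + [False] * len(X_mn)
    probs = []
    for i in range(len(X_mj)):
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: kernel_dist2(X[i], X[j], gamma))
        nearest = others[:k]
        probs.append(sum(is_mj[j] for j in nearest) / len(nearest))
    return probs

def centroid_affinities(X_mj, gamma=1.0, delta=1e-3):
    """Hypothetical AF stand-in (not SVDD): closeness to the MJ centroid in kernel space."""
    n = len(X_mj)
    K = [[rbf(a, b, gamma) for b in X_mj] for a in X_mj]
    mean_all = sum(map(sum, K)) / (n * n)
    # ||phi(x_i) - c||^2 = k(x_i, x_i) - (2/n) * sum_j k(x_i, x_j) + mean of all k(x_j, x_l)
    dists = [math.sqrt(max(0.0, K[i][i] - 2.0 * sum(K[i]) / n + mean_all))
             for i in range(n)]
    d_max = max(dists)
    return [1.0 - d / (d_max + delta) for d in dists]

def memberships(X_mj, X_mn, k=3, gamma=1.0):
    """Combine AF and CP by a product (an assumed combination rule)."""
    aff = centroid_affinities(X_mj, gamma)
    cp = class_probabilities(X_mj, X_mn, k, gamma)
    return [a * p for a, p in zip(aff, cp)]

# Toy data: five MJ points near the origin, one MJ point lying inside the minority cluster.
X_mj = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2), (-0.2, -0.1), (3.0, 3.1)]
X_mn = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.2)]

cps = class_probabilities(X_mj, X_mn)
mem = memberships(X_mj, X_mn)
for point, m in zip(X_mj, mem):
    print(point, round(m, 3))
```

In this toy setting, the MJ point that sits among the minority samples has only minority neighbours, so its CP and therefore its membership drop to zero, while the clustered MJ points keep positive memberships. In a fuzzy twin SVM, such memberships would typically weight the slack penalties of the corresponding samples.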
Data availability
The data source is specified in the manuscript.
Ethics declarations
Conflict of interest
The authors state that they have no potential conflict of interest.
About this article
Cite this article
Hazarika, B.B., Gupta, D. & Borah, P. Fuzzy twin support vector machine based on affinity and class probability for class imbalance learning. Knowl Inf Syst 65, 5259–5288 (2023). https://doi.org/10.1007/s10115-023-01904-8
DOI: https://doi.org/10.1007/s10115-023-01904-8