Fuzzy twin support vector machine based on affinity and class probability for class imbalance learning

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Recently, a robust and efficient classifier termed the affinity and class probability-based fuzzy support vector machine (ACFSVM) was proposed to address binary class imbalance and noisy data classification problems. Despite the excellent generalization ability of ACFSVM, there is scope to improve its classification ability. To this end, this work proposes a novel fuzzy twin support vector machine based on affinity and class probability (ACFTSVM). In ACFTSVM, regularization terms are added to the primal problems to diminish the negative influence of noise. The affinity (AF) of each majority (MJ) class datapoint is measured using a support vector data description model trained in kernel space on the MJ class training samples only. To reduce the influence of noise, the k-nearest neighbour method is used to estimate the class probability (CP) of the MJ class datapoints in the same kernel space. Samples with lower CPs are more likely to be noise, and their contribution to learning is reduced by their low memberships, which are calculated by adding the AFs and the CPs. Like ACFSVM, ACFTSVM gives preference to MJ class datapoints with higher AFs and CPs while reducing the influence of MJ class samples with lower AFs and CPs. As a result, the decision boundary is skewed towards the MJ class. Numerical simulations are carried out on five artificially imbalanced datasets and a few notable real-world datasets.
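The abstract gives no formulas, so the following is only a rough sketch of how a membership for MJ class points could be built from an affinity and a class probability. Several parts are assumptions rather than the authors' method: the function names are hypothetical, the kernel is taken to be Gaussian with an arbitrary `gamma`, the SVDD model is replaced by a much simpler stand-in (distance to the MJ class mean in kernel space), and the AF and CP are added and then halved so that the membership stays in [0, 1].

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def kernel_dist2(a, b, gamma=1.0):
    """Squared distance in RBF feature space: k(a,a) - 2k(a,b) + k(b,b)."""
    return 2.0 - 2.0 * rbf(a, b, gamma)

def class_probability(majority, minority, k=3, gamma=1.0):
    """CP of each MJ point: share of its k nearest neighbours
    (by kernel-space distance) that also belong to the MJ class."""
    labelled = [(p, 1) for p in majority] + [(p, 0) for p in minority]
    probs = []
    for i, p in enumerate(majority):
        # MJ points come first in `labelled`, so index i is the point itself.
        others = [(kernel_dist2(p, q, gamma), lab)
                  for j, (q, lab) in enumerate(labelled) if j != i]
        others.sort(key=lambda t: t[0])
        nearest = others[:k]
        probs.append(sum(lab for _, lab in nearest) / len(nearest))
    return probs

def affinity(majority, gamma=1.0, delta=1e-6):
    """Assumed stand-in for SVDD (not SVDD itself): squared kernel-space
    distance of each MJ point to the MJ class mean, rescaled so that points
    near the mean get values close to 1 and the farthest gets close to 0."""
    n = len(majority)
    # ||phi(x) - mean||^2 = k(x,x) - (2/n) sum_j k(x,x_j) + (1/n^2) sum_ij k(x_i,x_j)
    const = sum(rbf(a, b, gamma) for a in majority for b in majority) / n ** 2
    d2 = [max(0.0, 1.0 - 2.0 * sum(rbf(x, q, gamma) for q in majority) / n + const)
          for x in majority]
    d_max = max(d2)
    return [1.0 - d / (d_max + delta) for d in d2]

def memberships(majority, minority, k=3, gamma=1.0):
    """Membership of each MJ point: AF and CP added, then halved (assumption)."""
    af = affinity(majority, gamma)
    cp = class_probability(majority, minority, k, gamma)
    return [0.5 * (a + c) for a, c in zip(af, cp)]

# Toy data: four clustered MJ points plus one MJ point lying among the minority.
mj = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]
mn = [(2.1, 2.0), (2.0, 2.1), (1.9, 2.0)]
for point, m in zip(mj, memberships(mj, mn)):
    print(point, round(m, 3))
```

On this toy data the isolated MJ point at (2.0, 2.0) gets a membership near zero, while the clustered MJ points keep memberships close to one. In a fuzzy SVM such memberships would typically weight each sample's slack penalty, so the low-membership point contributes little to training.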

Data availability

The data source is specified in the manuscript.

Author information

Corresponding author

Correspondence to Deepak Gupta.

Ethics declarations

Conflict of interest

The authors state that they have no potential conflict of interest.

About this article

Cite this article

Hazarika, B.B., Gupta, D. & Borah, P. Fuzzy twin support vector machine based on affinity and class probability for class imbalance learning. Knowl Inf Syst 65, 5259–5288 (2023). https://doi.org/10.1007/s10115-023-01904-8

  • DOI: https://doi.org/10.1007/s10115-023-01904-8
