A non-convex robust small sphere and large margin support vector machine for imbalanced data classification

  • Original Article
  • Neural Computing and Applications

Abstract

The small sphere and large margin support vector machine (SSLM) is an effective method for imbalanced data classification. However, the hinge loss used in SSLM makes the model sensitive to noise, because outliers receive the largest penalties, and this leads to poor generalization performance. In this paper, we propose a Ramp loss small sphere and large margin support vector machine (Ramp SSLM) for imbalanced data classification to improve the performance of SSLM. Compared with SSLM, our model is more tolerant of noise and has fewer support vectors, and therefore has better scaling properties. The non-convex Ramp SSLM problem can be solved efficiently by the concave-convex procedure (CCCP), which solves a sequence of convex subproblems. Furthermore, a sequential minimal optimization (SMO) decomposition method is employed to handle large-scale datasets. Experiments are conducted on an artificial dataset, ten benchmark datasets, and a Chinese wine dataset, and evaluation metrics such as g-means and the \(F_1\) score are adopted. In these experiments, our method achieves better performance than other state-of-the-art algorithms, which shows the stability and effectiveness of the proposed algorithm.
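To make the contrast between the hinge loss and the Ramp loss concrete, the sketch below uses the standard margin-based form, in which the Ramp loss is written as the difference of two hinge functions, \(R_s(z) = H_1(z) - H_s(z)\) with \(H_a(z) = \max(0, a - z)\). This is an assumption for illustration: the abstract does not give the exact sphere-based parameterization used in Ramp SSLM, and the truncation point `s = -1` and all function names are hypothetical. The difference-of-convex form is what makes a CCCP-style scheme applicable. The same block also computes the g-means and \(F_1\) metrics named above from a confusion matrix.

```python
import math


def hinge(z, a=1.0):
    """Hinge function H_a(z) = max(0, a - z); unbounded as z -> -infinity."""
    return max(0.0, a - z)


def ramp(z, s=-1.0):
    """Ramp loss as a difference of two hinges: H_1(z) - H_s(z), with s < 1.

    The loss is capped at 1 - s, so a gross outlier cannot dominate the
    objective. The value s = -1 is an illustrative choice, not the paper's.
    """
    return hinge(z, 1.0) - hinge(z, s)


def g_mean_and_f1(y_true, y_pred, positive=1):
    """Return (g-mean, F1) for binary labels, treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on the positive class
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on the negative class
    precision = tp / (tp + fp) if tp + fp else 0.0

    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return g_mean, f1


if __name__ == "__main__":
    # A badly misclassified point (margin z = -10): hinge grows, ramp is capped.
    for z in (2.0, 0.0, -1.0, -10.0):
        print(f"z={z:6.1f}  hinge={hinge(z):5.1f}  ramp={ramp(z):4.1f}")

    y_true = [1, 1, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 1]
    print(g_mean_and_f1(y_true, y_pred))
```

With `s = -1`, the penalty for the point at `z = -10` is 11 under the hinge loss but only 2 under the Ramp loss, which illustrates why outliers have bounded influence on the solution.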

Data Availability

The ten benchmark datasets generated and/or analyzed during the current study are available in the UCI machine learning repository [http://archive.ics.uci.edu/ml/datasets.html] and Kaggle [https://www.kaggle.com]. The Chinese wine dataset is available from the corresponding author on reasonable request.

Notes

  1. http://archive.ics.uci.edu/ml/datasets.html.

  2. https://www.kaggle.com.

Acknowledgements

The authors gratefully acknowledge the helpful comments of the reviewers, which have improved the presentation. This work was supported in part by the National Natural Science Foundation of China (Nos. 12071475, 11671010) and the Beijing Natural Science Foundation (No. 4172035).

Author information

Corresponding author

Correspondence to Yitian Xu.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Xu, Y. A non-convex robust small sphere and large margin support vector machine for imbalanced data classification. Neural Comput & Applic 35, 3245–3261 (2023). https://doi.org/10.1007/s00521-022-07882-2

  • DOI: https://doi.org/10.1007/s00521-022-07882-2
