Abstract
Small sphere and large margin support vector machine (SSLM) is an effective method for imbalanced data classification. However, the hinge loss used in SSLM makes the model sensitive to noise and leads to poor generalization performance, because outliers receive the largest penalties. In this paper, we propose a Ramp loss small sphere and large margin support vector machine (Ramp SSLM) for imbalanced data classification to improve the performance of SSLM. Compared with SSLM, our model can accommodate noise, has fewer support vectors and thus has better scaling properties. The non-convex Ramp SSLM problem can be solved efficiently by the concave-convex procedure (CCCP), which reduces it to a sequence of convex problems. Furthermore, a sequential minimal optimization (SMO) decomposition method is employed to handle large-scale datasets. Experiments are conducted on an artificial dataset, ten benchmark datasets and a Chinese wine dataset, and evaluation metrics such as g-means and \(F_1\) score are adopted. Our method achieves better performance than other state-of-the-art algorithms, which shows the stability and effectiveness of the proposed algorithm.






Data Availability
The ten benchmark datasets analyzed during the current study are available in the UCI machine learning repository [http://archive.ics.uci.edu/ml/datasets.html] and Kaggle [https://www.kaggle.com]. The Chinese wine dataset is available from the corresponding author on reasonable request.
Acknowledgements
The authors gratefully acknowledge the helpful comments of the reviewers, which have improved the presentation. This work was supported in part by the National Natural Science Foundation of China (Nos. 12071475, 11671010) and the Beijing Natural Science Foundation (No. 4172035).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest related to this work.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Xu, Y. A non-convex robust small sphere and large margin support vector machine for imbalanced data classification. Neural Comput & Applic 35, 3245–3261 (2023). https://doi.org/10.1007/s00521-022-07882-2
DOI: https://doi.org/10.1007/s00521-022-07882-2