Abstract
As support vector machines (SVMs) are used extensively in machine learning applications, it is essential to obtain models that are both sparse and robust to noise in the data set. Although many approaches to robust SVMs have been proposed, the robust SVM based on the rescaled hinge loss function (RSVM-RHHQ) has attracted particular attention: applying correntropy to the hinge loss adds a noticeable amount of robustness to the model. However, the sparsity of that model can be improved further. In this work, we focus on enhancing the sparsity of RSVM-RHHQ. Since this work improves upon RSVM-RHHQ, we follow its protocol of injecting label noise into the data, but with an altogether new problem formulation: we apply correntropy to the \(\alpha \)-hinge loss function, which yields a better loss function than the rescaled hinge loss, and we pair this non-convex, non-smooth loss with a non-smooth regularizer. We solve the resulting non-smooth, non-convex problem using a primal–dual proximal method. We find that this combination not only adds sparsity to the model but also outperforms existing robust SVM methods in robustness to label noise. We provide a convergence proof for the proposed approach, along with the time complexity of the optimization technique. We compare the proposed method with existing robust SVM methods on a variety of publicly available real-world data sets, including small data sets, large data sets, and data sets with significant class imbalance. Experimental results show that the proposed approach outperforms existing methods in sparseness, accuracy, and robustness. We also provide a sensitivity analysis of the regularization parameter under label noise in the data set.
Data Availability
The source code for the implementation is available at https://github.com/manisha1427/Robustsvm1. For any clarification on implementation, readers are requested to contact the first author (M. Singla).
References
Allgower EL, Georg K, Miranda R (1993) Exploiting symmetry in applied and numerical analysis: 1992 AMS-SIAM summer seminar in applied mathematics, July 26–August 1, 1992, Colorado State University, vol 29. American Mathematical Society, Providence
Barron JT (2017) A more general robust loss function. arXiv preprint arXiv:1701.03077
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Best MJ (1996) An algorithm for the solution of the parametric quadratic programming problem. In: Fischer H, Riedmüller B, Schäffler S (eds) Applied mathematics and parallel computing. Springer, Berlin, pp 57–76
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
Chivers I, Sleightholme J (2015) An introduction to algorithms and the big O notation. In: Chivers I (ed) Introduction to programming with Fortran. Springer, Berlin, pp 359–364
Das A, Panda R, Roy-Chowdhury AK (2017) Continuous adaptation of multi-camera person identification models through sparse non-redundant representative selection. Comput Vis Image Underst 156:66–78
Fan M, Zhang X, Du L, Chen L, Tao D (2017) Semi-supervised learning through label propagation on geodesics. IEEE Trans Cybern 48(5):1486–1499
Gal T (2010) Postoptimal analyses, parametric programming, and related topics: degeneracy, multicriteria decision making, redundancy. Walter de Gruyter, Berlin
Gong R, Wu C, Chu M (2018) Steel surface defect classification using multiple hyper-spheres support vector machine with additional information. Chemom Intell Lab Syst 172:109–117
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems. Curran Associates Inc., CA, USA, pp 1024–1034
Han B, Yao Q, Yu X, Niu G, Xu M, Hu W, Tsang I, Sugiyama M (2018) Co-sampling: training robust networks for extremely noisy supervision. arXiv preprint arXiv:1804.06872
Hillermeier C (2001) Nonlinear multiobjective optimization: a generalized homotopy approach, vol 135. Springer, Berlin
Hou Q, Liu L, Zhen L, Jing L (2018) A novel projection nonparallel support vector machine for pattern classification. Eng Appl Artif Intell 75:64–75
Huang LW, Shao YH, Zhang J, Zhao YT, Teng JY (2019) Robust rescaled hinge loss twin support vector machine for imbalanced noisy classification. IEEE Access 7:65390–65404
Huang X, Shi L, Suykens JA (2014) Ramp loss linear programming support vector machine. J Mach Learn Res 15(1):2185–2211
Huang X, Shi L, Suykens JA (2014) Support vector machine classifier with pinball loss. IEEE Trans Pattern Anal Mach Intell 36(5):984–997
Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv Preprint arXiv:1609.02907
Liu T, Tao D (2016) Classification with noisy labels by importance reweighting. IEEE Trans Pattern Anal Mach Intell 38(3):447–461
Liu W, Ma X, Zhou Y, Tao D, Cheng J (2018) \(p\)-Laplacian regularization for scene recognition. IEEE Trans Cybern 49(8):2927–2940
Ma X, Liu W, Li S, Tao D, Zhou Y (2018) Hypergraph \( p \)-Laplacian regularization for remotely sensed image recognition. IEEE Trans Geosci Remote Sens 57(3):1585–1595
Ma Y, Li L, Huang X, Wang S (2011) Robust support vector machine using least median loss penalty. IFAC Proc Vol 44(1):11208–11213
Natarajan N, Dhillon IS, Ravikumar PK, Tewari A (2013) Learning with noisy labels. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Nevada, United States, pp 1196–1204
Nikolova M, Ng MK (2005) Analysis of half-quadratic minimization methods for signal and image recovery. SIAM J Sci Comput 27(3):937–966
Ritter K (1981) On parametric linear and quadratic programming problems. Tech. rep., Wisconsin Univ-Madison Mathematics Research Center
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20(1):61–80
Shen X, Niu L, Qi Z, Tian Y (2017) Support vector machine classifier with truncated pinball loss. Pattern Recognit 68:199–210
Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recognit 47(1):441–453
Singla M, Shukla KK (2019) Robust statistics-based support vector machine and its variants: a survey. Neural Comput Appl 32:11173–11194
Singla M, Ghosh D, Shukla KK (2019) A survey of robust optimization based machine learning with special reference to support vector machines. Int J Mach Learn Cybern 11:1359–1385
Singla M, Ghosh D, Shukla K, Pedrycz W (2020) Robust twin support vector regression based on rescaled hinge loss. Pattern Recognit 105:107395
Song Q, Hu W, Xie W (2002) Robust support vector machine with bullet hole image classification. IEEE Trans Syst Man Cybern Part C (Appl Rev) 32(4):440–448
Suykens JA, De Brabanter J, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1–4):85–105
Suzumura S, Ogawa K, Sugiyama M, Takeuchi I (2014) Outlier path: a homotopy algorithm for robust SVM. In: International conference on machine learning, pp 1098–1106
Tian Y, Qi Z, Ju X, Shi Y, Liu X (2013) Nonparallel support vector machines for pattern classification. IEEE Trans Cybern 44(7):1067–1079
Van Rooyen B, Menon A, Williamson RC (2015) Learning with symmetric label noise: the importance of being unhinged. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems. Curran Associates, Inc., Montreal, QC, Canada, pp 10–18
Vapnik V (1963) Pattern recognition using generalized portrait method. Autom Remote Control 24:774–780
Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999
Wang CD, Lai J (2013) Position regularized support vector domain description. Pattern Recognit 46(3):875–884
Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Wu Y, Liu Y (2007) Robust truncated hinge loss support vector machines. J Am Stat Assoc 102(479):974–983
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst
Xing HJ, Ji M (2018) Robust one-class support vector machine with rescaled hinge loss function. Pattern Recognit 84:152–164
Xu G, Cao Z, Hu BG, Principe JC (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recognit 63:139–148
Xu L, Crammer K, Schuurmans D (2006) Robust support vector machine training via convex outlier ablation. AAAI 6:536–542
Yang L, Dong H (2018) Support vector machine with truncated pinball loss and its application in pattern recognition. Chemom Intell Lab Syst 177:89–99
Yang T, Mahdavi M, Jin R, Zhu S (2015) An efficient primal dual prox method for non-smooth optimization. Mach Learn 98(3):369–406
Yang X, Song Q, Wang Y (2007) A weighted support vector machine for data classification. Int J Pattern Recognit Artif Intell 21(05):961–976
Yang X, Tan L, He L (2014) A robust least squares support vector machine for regression and classification with noise. Neurocomputing 140:41–52
Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44(12):2431–2442
Acknowledgements
The authors truly appreciate the comments and suggestions by the anonymous reviewers, which have resulted in a substantial increase in the quality of the paper. The first author would like to acknowledge a research fellowship from the IIT (BHU) Varanasi.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Singla, M., Ghosh, D. & Shukla, K.K. Improved Sparsity of Support Vector Machine with Robustness Towards Label Noise Based on Rescaled \(\alpha \)-Hinge Loss with Non-smooth Regularizer. Neural Process Lett 52, 2211–2239 (2020). https://doi.org/10.1007/s11063-020-10346-0