
Improved Sparsity of Support Vector Machine with Robustness Towards Label Noise Based on Rescaled \(\alpha \)-Hinge Loss with Non-smooth Regularizer


Abstract

Since support vector machines (SVMs) are used extensively in machine learning applications, it is essential to obtain a sparse model that is also robust to noise in the data set. Although many approaches to robust SVMs have been proposed, the robust SVM based on the rescaled hinge loss function (RSVM-RHHQ) has attracted a great deal of attention: combining correntropy with the hinge loss function adds a noticeable amount of robustness to the model. However, the sparsity of the model can be further improved. In this work, we focus on enhancing the sparsity of RSVM-RHHQ. As our work improves upon RSVM-RHHQ, we follow its protocol of adding label noise to the data, but with an altogether new problem formulation. We apply correntropy to the \(\alpha \)-hinge loss function, which results in a better loss function than the rescaled hinge loss, and we pair this non-convex and non-smooth loss with a non-smooth regularizer. We solve the resulting non-smooth, non-convex problem using the primal–dual proximal method. This combination not only adds sparsity to the model but also surpasses existing robust SVM methods in robustness towards label noise. We also provide a convergence proof for the proposed approach and include the time complexity of the optimization technique. We perform experiments on various publicly available real-world data sets, including small data sets, large data sets, and data sets with significant class imbalance, to compare the proposed method with existing robust SVM methods. Experimental results show that the proposed approach outperforms existing methods in sparseness, accuracy, and robustness. Finally, we provide a sensitivity analysis of the regularization parameter under label noise in the data set.
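Although the full derivation appears only in the paper itself, the two ingredients named in the abstract can be illustrated in isolation: (i) the correntropy-induced rescaling of a hinge-type loss, which is the construction behind RSVM-RHHQ, and (ii) the proximal (soft-thresholding) step through which a primal–dual proximal method handles a non-smooth \(\ell _1\)-type regularizer. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the exact form of the \(\alpha \)-hinge loss is not reproduced on this page, so the plain hinge loss stands in as the base loss, and the parameter `eta` (the correntropy kernel scale) and the \(\ell _1\) regularizer are illustrative choices.

```python
import numpy as np

def hinge(u):
    # Plain hinge loss max(0, 1 - u), with u = y * f(x) the signed margin.
    # Stand-in for the paper's alpha-hinge loss, whose exact form is
    # given only in the full text.
    return np.maximum(0.0, 1.0 - u)

def rescaled_loss(u, eta=1.0, base_loss=hinge):
    # Correntropy-induced rescaling of a base loss, as in RSVM-RHHQ:
    #   l(u) = beta * (1 - exp(-eta * base_loss(u))),
    #   beta = 1 / (1 - exp(-eta)).
    # The exponential bounds the loss by beta, so points with very
    # negative margins (e.g. label-noise outliers) cannot dominate
    # the objective.
    beta = 1.0 / (1.0 - np.exp(-eta))
    return beta * (1.0 - np.exp(-eta * base_loss(u)))

def soft_threshold(w, t):
    # Proximal operator of t * ||w||_1 (soft-thresholding):
    #   prox(w) = sign(w) * max(|w| - t, 0).
    # A primal-dual proximal method applies a step of this kind to
    # handle the non-smooth regularizer; it drives small coefficients
    # exactly to zero, which is the source of sparsity.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

if __name__ == "__main__":
    margins = np.array([1.0, 0.2, -5.0])      # clean, borderline, mislabeled
    print(rescaled_loss(margins, eta=0.5))    # bounded even at u = -5
    print(soft_threshold(np.array([0.03, -0.8, 0.5]), t=0.1))  # [0, -0.7, 0.4]
```

Because the rescaled loss is bounded above by `beta`, a grossly mislabeled point contributes at most a constant to the objective, which is the mechanism behind the robustness to label noise described in the abstract; the soft-thresholding step is what zeroes out small weight coefficients and hence yields the sparser model.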


Data Availability

The source code for the implementation is available at https://github.com/manisha1427/Robustsvm1. For any clarification on the implementation, readers are requested to contact the first author (M. Singla).


Acknowledgements

The authors truly appreciate the comments and suggestions of the anonymous reviewers, which have resulted in a substantial improvement in the quality of the paper. The first author acknowledges a research fellowship from IIT (BHU) Varanasi.

Author information


Corresponding author

Correspondence to Debdas Ghosh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Singla, M., Ghosh, D. & Shukla, K.K. Improved Sparsity of Support Vector Machine with Robustness Towards Label Noise Based on Rescaled \(\alpha \)-Hinge Loss with Non-smooth Regularizer. Neural Process Lett 52, 2211–2239 (2020). https://doi.org/10.1007/s11063-020-10346-0
