Abstract
As support vector machines (SVMs) are used extensively in machine learning applications, it is essential to obtain models that are both sparse and robust to noise in the data set. Although many approaches to robust SVMs have been proposed, the robust SVM based on the rescaled hinge loss function (RSVM-RHHQ) has attracted particular attention: applying correntropy to the hinge loss adds a noticeable amount of robustness to the model. However, the sparsity of that model can be improved further. In this work, we focus on enhancing the sparsity of RSVM-RHHQ. Since this work improves upon RSVM-RHHQ, we follow its protocol of injecting label noise into the data, but with an altogether new problem formulation: we apply correntropy to the \(\alpha \)-hinge loss function, which yields a better loss function than the rescaled hinge loss, and we pair this non-convex, non-smooth loss with a non-smooth regularizer. We solve the resulting non-smooth, non-convex problem using a primal–dual proximal method. We find that this combination not only adds sparsity to the model but also outperforms existing robust SVM methods in robustness to label noise. We provide a convergence proof for the proposed approach, along with the time complexity of the optimization technique. We compare the proposed method with existing robust SVM methods on a variety of publicly available real-world data sets, including small data sets, large data sets, and data sets with significant class imbalance. Experimental results show that the proposed approach outperforms existing methods in sparseness, accuracy, and robustness. We also provide a sensitivity analysis of the regularization parameter under label noise in the data set.
Data Availability
The source code for the implementation is available at https://github.com/manisha1427/Robustsvm1. For any clarification on implementation, readers are requested to contact the first author (M. Singla).
References
Allgower EL, Georg K, Miranda R (1993) Exploiting symmetry in applied and numerical analysis: 1992 AMS-SIAM summer seminar in applied mathematics, July 26–August 1, 1992, Colorado State University, vol 29. American Mathematical Society, Providence
Barron JT (2017) A more general robust loss function. arXiv preprint arXiv:1701.03077
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Best MJ (1996) An algorithm for the solution of the parametric quadratic programming problem. In: Fischer H, Riedmüller B, Schäffler S (eds) Applied mathematics and parallel computing. Springer, Berlin, pp 57–76
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
Chivers I, Sleightholme J (2015) An introduction to algorithms and the big O notation. In: Chivers I (ed) Introduction to programming with Fortran. Springer, Berlin, pp 359–364
Das A, Panda R, Roy-Chowdhury AK (2017) Continuous adaptation of multi-camera person identification models through sparse non-redundant representative selection. Comput Vis Image Underst 156:66–78
Fan M, Zhang X, Du L, Chen L, Tao D (2017) Semi-supervised learning through label propagation on geodesics. IEEE Trans Cybern 48(5):1486–1499
Gal T (2010) Postoptimal analyses, parametric programming, and related topics: degeneracy, multicriteria decision making, redundancy. Walter de Gruyter, Berlin
Gong R, Wu C, Chu M (2018) Steel surface defect classification using multiple hyper-spheres support vector machine with additional information. Chemom Intell Lab Syst 172:109–117
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems. Curran Associates Inc., CA, USA, pp 1024–1034
Han B, Yao Q, Yu X, Niu G, Xu M, Hu W, Tsang I, Sugiyama M (2018) Co-sampling: training robust networks for extremely noisy supervision. arXiv preprint arXiv:1804.06872
Hillermeier C (2001) Nonlinear multiobjective optimization: a generalized homotopy approach, vol 135. Springer, Berlin
Hou Q, Liu L, Zhen L, Jing L (2018) A novel projection nonparallel support vector machine for pattern classification. Eng Appl Artif Intell 75:64–75
Huang LW, Shao YH, Zhang J, Zhao YT, Teng JY (2019) Robust rescaled hinge loss twin support vector machine for imbalanced noisy classification. IEEE Access 7:65390–65404
Huang X, Shi L, Suykens JA (2014) Ramp loss linear programming support vector machine. J Mach Learn Res 15(1):2185–2211
Huang X, Shi L, Suykens JA (2014) Support vector machine classifier with pinball loss. IEEE Trans Pattern Anal Mach Intell 36(5):984–997
Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv Preprint arXiv:1609.02907
Liu T, Tao D (2016) Classification with noisy labels by importance reweighting. IEEE Trans Pattern Anal Mach Intell 38(3):447–461
Liu W, Ma X, Zhou Y, Tao D, Cheng J (2018) \(p\)-Laplacian regularization for scene recognition. IEEE Trans Cybern 49(8):2927–2940
Ma X, Liu W, Li S, Tao D, Zhou Y (2018) Hypergraph \( p \)-Laplacian regularization for remotely sensed image recognition. IEEE Trans Geosci Remote Sens 57(3):1585–1595
Ma Y, Li L, Huang X, Wang S (2011) Robust support vector machine using least median loss penalty. IFAC Proc Vol 44(1):11208–11213
Natarajan N, Dhillon IS, Ravikumar PK, Tewari A (2013) Learning with noisy labels. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., Nevada, United States, pp 1196–1204
Nikolova M, Ng MK (2005) Analysis of half-quadratic minimization methods for signal and image recovery. SIAM J Sci Comput 27(3):937–966
Ritter K (1981) On parametric linear and quadratic programming problems. Tech. rep., Wisconsin Univ-Madison Mathematics Research Center
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20(1):61–80
Shen X, Niu L, Qi Z, Tian Y (2017) Support vector machine classifier with truncated pinball loss. Pattern Recognit 68:199–210
Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recognit 47(1):441–453
Singla M, Shukla KK (2019) Robust statistics-based support vector machine and its variants: a survey. Neural Comput Appl 32:11173–11194
Singla M, Ghosh D, Shukla KK (2019) A survey of robust optimization based machine learning with special reference to support vector machines. Int J Mach Learn Cybern 11:1359–1385
Singla M, Ghosh D, Shukla K, Pedrycz W (2020) Robust twin support vector regression based on rescaled hinge loss. Pattern Recognit 105:107395
Song Q, Hu W, Xie W (2002) Robust support vector machine with bullet hole image classification. IEEE Trans Syst Man Cybern Part C (Appl Rev) 32(4):440–448
Suykens JA, De Brabanter J, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1–4):85–105
Suzumura S, Ogawa K, Sugiyama M, Takeuchi I (2014) Outlier path: a homotopy algorithm for robust SVM. In: International conference on machine learning, pp 1098–1106
Tian Y, Qi Z, Ju X, Shi Y, Liu X (2013) Nonparallel support vector machines for pattern classification. IEEE Trans Cybern 44(7):1067–1079
Van Rooyen B, Menon A, Williamson RC (2015) Learning with symmetric label noise: the importance of being unhinged. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems. Curran Associates, Inc., Montreal, QC, Canada, pp 10–18
Vapnik V (1963) Pattern recognition using generalized portrait method. Autom Remote Control 24:774–780
Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999
Wang CD, Lai J (2013) Position regularized support vector domain description. Pattern Recognit 46(3):875–884
Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Wu Y, Liu Y (2007) Robust truncated hinge loss support vector machines. J Am Stat Assoc 102(479):974–983
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst
Xing HJ, Ji M (2018) Robust one-class support vector machine with rescaled hinge loss function. Pattern Recognit 84:152–164
Xu G, Cao Z, Hu BG, Principe JC (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recognit 63:139–148
Xu L, Crammer K, Schuurmans D (2006) Robust support vector machine training via convex outlier ablation. AAAI 6:536–542
Yang L, Dong H (2018) Support vector machine with truncated pinball loss and its application in pattern recognition. Chemom Intell Lab Syst 177:89–99
Yang T, Mahdavi M, Jin R, Zhu S (2015) An efficient primal dual prox method for non-smooth optimization. Mach Learn 98(3):369–406
Yang X, Song Q, Wang Y (2007) A weighted support vector machine for data classification. Int J Pattern Recognit Artif Intell 21(05):961–976
Yang X, Tan L, He L (2014) A robust least squares support vector machine for regression and classification with noise. Neurocomputing 140:41–52
Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44(12):2431–2442
Acknowledgements
The authors truly appreciate the comments and suggestions by the anonymous reviewers, which have resulted in a substantial increase in the quality of the paper. The first author would like to acknowledge a research fellowship from the IIT (BHU) Varanasi.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Singla, M., Ghosh, D. & Shukla, K.K. Improved Sparsity of Support Vector Machine with Robustness Towards Label Noise Based on Rescaled \(\alpha \)-Hinge Loss with Non-smooth Regularizer. Neural Process Lett 52, 2211–2239 (2020). https://doi.org/10.1007/s11063-020-10346-0