Asymmetric kernel-based robust classification by ADMM

  • Regular Paper
Knowledge and Information Systems

Abstract

Correntropy is a local second-order statistical measure in kernel space, and different kernel functions induce correntropies with different properties. In this work, we propose an asymmetric mixture kernel and the corresponding correntropy. We then derive a new loss function (called the RCH-loss) induced by the correntropy with the reproducing asymmetric kernel. Several important properties of the proposed kernel and RCH-loss are demonstrated, including non-convexity, boundedness, asymmetry and asymptotic approximation; moreover, the RCH-loss satisfies the Bayes optimal decision rule. With the RCH-loss, a new framework for robust classification is built. Theoretically, we prove a generalization bound for the proposed model based on Rademacher complexity. A DC (difference of convex functions) programming algorithm (DCA) is then developed to solve the problem iteratively, where the ADMM (alternating direction method of multipliers) is used to solve the subproblem at each iteration. We also analyze the computational complexity of the algorithm and its sensitivity to parameters. Numerical experiments are carried out on various datasets, including benchmark and artificial data sets with different noise levels; the results demonstrate the feasibility and effectiveness of the proposed methods.


Notes

  1. https://archive.ics.uci.edu/ml/index.php.

References

  1. Vapnik VN (2002) The nature of statistical learning theory. IEEE Trans Neural Netw 8(6):1564

  2. Alam S, Sonbhadra SK, Agarwal S et al (2020) One-class support vector classifiers: a survey. Knowl-Based Syst 196:105754

  3. Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300

  4. Suykens JAK, Brabanter JD, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1):85–105

  5. Yang B, Shao QM, Pan L et al (2018) A study on regularized weighted least square support vector classifier. Pattern Recogn Lett 108:48–55

  6. Liu D, Shi Y, Tian Y et al (2016) Ramp loss least squares support vector machine. J Comput Sci S1877750316300096

  7. Shen X, Niu L, Qi Z et al (2017) Support vector machine classifier with truncated pinball loss. Pattern Recogn 68:199–210

  8. Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recogn 47(1):441–453

  9. Xu G, Cao Z, Hu BG et al (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recogn 63:139–148

  10. Ren Z, Yang L (2018) Correntropy-based robust extreme learning machine for classification. Neurocomputing 313:74–84

  11. Yuan C, Yang L, Sun P (2021) Correntropy-based metric for robust twin support vector machine. Inf Sci 545:82–101

  12. Liu W, Pokharel PP, Principe JC (2007) Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans Signal Process 55(11):5286–5298

  13. Du B, Tang X, Wang Z et al (2018) Robust graph-based semisupervised learning for noisy labeled data via maximum correntropy criterion. IEEE Trans Cybern 99:1–14

  14. Feng Y, Huang X, Shi L et al (2015) Learning with the maximum correntropy criterion induced losses for regression. J Mach Learn Res 16(1):993–1034

  15. Yang L, Dong H (2019) Robust support vector machine with generalized quantile loss for classification and regression. Appl Soft Comput 81:105483

  16. Chen B, Wang X, Lu N, Wang S, Cao J, Qin J (2018) Mixture correntropy for robust learning. Pattern Recogn 79:318–327

  17. Wang Y, Yang L (2019) A robust classification framework with mixture correntropy. Inf Sci 491:306–318

  18. Yang L, Sheng D, Chao Y, Min Z (2020) Robust regression framework with asymmetrically analogous to correntropy-induced loss. Knowl-Based Syst 191:105211

  19. Oliveira WD (2019) Proximal bundle methods for nonsmooth DC programming. J Global Optim 75(2):523–563

  20. Majzoobi L, Lahouti F, Shah-Mansouri V (2019) Analysis of distributed ADMM algorithm for consensus optimization in presence of node error. IEEE Trans Signal Process 67(7):1

  21. Boyd S, Parikh N, Chu E et al (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122

  22. Geng FZ, Qian SP (2014) Piecewise reproducing kernel method for singularly perturbed delay initial value problems. Appl Math Lett 37(11):67–71

  23. Boyd S, Vandenberghe L (2006) Convex optimization. IEEE Trans Autom Control 51(11):1859

  24. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3:1–122

  25. Hajinezhad D, Shi Q (2018) Alternating direction method of multipliers for a class of nonconvex bilinear optimization: convergence analysis and applications. J Global Optim 70(1):261–288

  26. Ding H, Qin S, Wu Y et al (2021) Asymptotic properties on high-dimensional multivariate regression M-estimation. J Multivar Anal

  27. Duc KT, Chiogna M, Adimari G (2020) Nonparametric estimation of ROC surfaces under verification bias. REVSTAT Stat J 18(5):697–720

  28. Huang X, Shi L, Suykens JAK (2014) Asymmetric least squares support vector machine classifiers. Comput Stat Data Anal 70(2):395–405

  29. Lin Y (2004) A note on margin-based loss functions in classification. Stat Probab Lett 68(1):73–82

  30. Bartlett PL, Mendelson S (2003) Rademacher and Gaussian complexities: risk bounds and structural results. J Mach Learn Res 3:463–482

Acknowledgements

This work is supported by the National Natural Science Foundation of China (11471010, 11271367). The authors thank the referees and the editors, whose suggestions improved the paper significantly.

Author information

Corresponding author

Correspondence to Liming Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix. DC programming and DCA

We outline the main algorithmic results for DC programming [18, 19]. The key idea is to decompose the objective function into the difference of two convex functions; successive convex approximations of the objective then yield a sequence of solutions that converges to a stationary point, possibly an optimal solution. In general, a so-called DC program minimizes a DC function:

$$\begin{aligned} f(x)=g(x)-h(x),\quad x\in R^{n} \end{aligned}$$
(41)

with g(x) and h(x) being convex functions.
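For intuition, a standard example of such a decomposition (our illustration, not taken from the paper) is the non-convex ramp loss of [6], which splits into two convex hinge-type functions:

$$\begin{aligned} \min \{1,\max (0,1-t)\}=\underbrace{\max (0,1-t)}_{g(t)}-\underbrace{\max (0,-t)}_{h(t)}, \end{aligned}$$

so that minimizing the ramp loss fits the DC template (41) directly; the non-convex RCH-loss model in the main text is handled in the same spirit.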

The DCA is an iterative algorithm based on local optimality conditions and duality [14–17]. Its main scheme is as follows (for simplicity, we omit the dual part here): at each iteration, one replaces the second component h of the primal DC problem (\(P_{dc}\)) by its affine minorization \(h(x^{k})+(x-x^{k})^{T}y^{k}\), which generates the convex program:

$$\begin{aligned} \min _{x\in R^{n}}\left\{ g(x)-h(x^{k})-(x-x^{k})^{T}y^{k}\right\} ,\quad y^{k}\in \partial h(x^{k}) \end{aligned}$$
(42)

where \( \partial h\) is the subdifferential of convex function h. In practice, a simplified form of the DCA is used. Two sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) satisfying \( y^{k}\in \partial h(x^{k})\) are constructed, and \(x^{k+1}\) is a solution to the convex program (42). The simplified DCA is described as follows.

Initialization: Choose an initial point \(x^{0}\in R^{n}\) and let \(k=0\)

Repeat

Calculate \(y^{k}\in \partial h(x^k)\)

Solve convex program (42) to obtain \(x^{k+1} \)

Let k:=k+1

Until some stopping criterion is satisfied.

DCA is a descent method without line search, and it converges linearly for general DC programs.
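To make the scheme concrete, the following is a minimal Python sketch of the simplified DCA, assuming the toy decomposition \(f(x)=\Vert x\Vert ^{2}-2\Vert x\Vert _{1}\) (our example, not the paper's model); the convex subproblem (42) is solved here with a generic solver in place of the ADMM used in the paper.

import numpy as np
from scipy.optimize import minimize

def g(x):                                  # first convex component: ||x||^2
    return float(np.dot(x, x))

def grad_h(x):                             # a subgradient of h(x) = 2*||x||_1
    return 2.0 * np.sign(x)                # sign(0) = 0 is a valid subgradient

def dca(x0, tol=1e-6, max_iter=100):
    # Simplified DCA: linearize h at x^k and solve the convex program (42).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        y = grad_h(x)                      # y^k in the subdifferential of h
        # Subproblem (42): minimize g(z) - h(x^k) - (z - x^k)^T y^k;
        # the terms that are constant in z do not affect the minimizer.
        z = minimize(lambda z: g(z) - y @ z, x).x
        if np.linalg.norm(z - x) < tol:    # stopping criterion
            break
        x = z
    return x

# f has stationary points at x_i = +/-1; starting from [3, -2],
# the iteration reaches [1, -1] after one subproblem solve.
print(dca(np.array([3.0, -2.0])))

Each outer iteration decreases the objective, in line with the descent property noted above.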

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ding, G., Yang, L. Asymmetric kernel-based robust classification by ADMM. Knowl Inf Syst 65, 89–110 (2023). https://doi.org/10.1007/s10115-022-01758-6
