Abstract
Correntropy is a local second-order statistical measure defined in kernel space, and different kernel functions induce correntropy measures with different properties. In this work, we propose an asymmetric mixture kernel and the corresponding correntropy. We then derive a new loss function, called the RCH-loss, induced by the correntropy with the reproducing asymmetric kernel. Several important properties of the proposed kernel and the RCH-loss are established, including non-convexity, boundedness, asymmetry, and asymptotic approximation; moreover, the RCH-loss satisfies the Bayes optimal decision rule. Building on this loss function, we construct a new framework for robust classification. Theoretically, we prove a generalization bound for the proposed model based on Rademacher complexity. A DC (difference of convex functions) programming algorithm (DCA) is then developed to solve the resulting problem iteratively, where ADMM (the alternating direction method of multipliers) is used to solve the convex subproblem at each iteration. We also analyze the computational complexity of the algorithm and the sensitivity of its parameters. Numerical experiments are carried out on benchmark and artificial datasets with different noise levels. The experimental results demonstrate the feasibility and effectiveness of the proposed methods.
References
Vapnik VN (2002) The nature of statistical learning theory. IEEE Trans Neural Netw 8(6):1564
Alam S, Sonbhadra SK, Agarwal S et al (2020) One-class support vector classifiers: a survey. Knowl-Based Syst 196:105754
Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
Suykens JAK, Brabanter JD, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1):85–105
Yang B, Shao QM, Pan L et al (2018) A study on regularized weighted least square support vector classifier. Pattern Recogn Lett 108:48–55
Liu D, Shi Y, Tian Y et al (2016) Ramp loss least squares support vector machine. J Comput Sci
Shen X, Niu L, Qi Z et al (2017) Support vector machine classifier with truncated pinball loss. Pattern Recogn 68:199–210
Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recogn 47(1):441–453
Xu G, Cao Z, Hu BG et al (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recogn 63:139–148
Ren Z, Yang L (2018) Correntropy-based robust extreme learning machine for classification. Neurocomputing 313:74–84
Yuan C, Yang L, Sun P (2021) Correntropy-based metric for robust twin support vector machine. Inf Sci 545:82–101
Liu W, Pokharel PP, Principe JC (2007) Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans Signal Process 55(11):5286–5298
Du B, Tang X, Wang Z et al (2018) Robust graph-based semisupervised learning for noisy labeled data via maximum correntropy criterion. IEEE Trans Cybern 99:1–14
Feng Y, Huang X, Shi L et al (2015) Learning with the maximum correntropy criterion induced losses for regression. J Mach Learn Res 16(1):993–1034
Yang L, Dong H (2019) Robust support vector machine with generalized quantile loss for classification and regression. Appl Soft Comput 81:105483
Chen B, Wang X, Lu N, Wang S, Cao J, Qin J (2018) Mixture correntropy for robust learning. Pattern Recogn 79:318–327
Wang Y, Yang L (2019) A robust classification framework with mixture correntropy. Inf Sci 491:306–318
Yang L, Ding S, Yuan C, Zhang M (2020) Robust regression framework with asymmetrically analogous to correntropy-induced loss. Knowl-Based Syst 191:105211
Oliveira WD (2019) Proximal bundle methods for nonsmooth DC programming. J Global Optim 75(2):523–563
Majzoobi L, Lahouti F, Shah-Mansouri V (2019) Analysis of distributed ADMM algorithm for consensus optimization in presence of node error. IEEE Trans Signal Process 67(7):1
Boyd S, Parikh N, Chu E et al (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Geng FZ, Qian SP (2014) Piecewise reproducing kernel method for singularly perturbed delay initial value problems. Appl Math Lett 37(11):67–71
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Hajinezhad D, Shi Q (2018) Alternating direction method of multipliers for a class of nonconvex bilinear optimization: convergence analysis and applications. J Global Optim 70(1):261–288
Ding H, Qin S, Wu Y et al (2021) Asymptotic properties on high-dimensional multivariate regression M-estimation. J Multivar Anal
Duc KT, Chiogna M, Adimari G (2020) Nonparametric estimation of ROC surfaces under verification bias. Revstat- Stat J 18(5):697–720
Huang X, Shi L, Suykens JAK (2014) Asymmetric least squares support vector machine classifiers. Comput Stat Data Anal 70:395–405
Lin Y (2004) A note on margin-based loss functions in classification. Stat Probab Lett 68(1):73–82
Bartlett PL, Mendelson S (2003) Rademacher and Gaussian complexities: risk bounds and structural results. J Mach Learn Res 3:463–482
Acknowledgements
This work is supported by the National Natural Science Foundation of China (11471010, 11271367). The authors also thank the referees and editors, whose suggestions significantly improved the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix. DC programming and DCA
We outline the main algorithmic results for DC programming [18, 19]. The key idea of DC programming is to decompose the objective function into the difference of two convex functions; a sequence of convex approximations of the objective then yields a sequence of solutions converging to a stationary point, possibly an optimal solution. Generally speaking, a so-called DC program minimizes a DC function:
\[
(P_{dc})\qquad \min_{x\in R^{n}} f(x)=g(x)-h(x),
\]
with g(x) and h(x) being convex functions.
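For instance, the ramp loss used in robust classification (an illustrative example, not the RCH-loss of this paper) admits exactly such a decomposition into two hinge functions:
\[
R_{s}(u)=\max(0,\,1-u)-\max(0,\,s-u),\qquad s<1,
\]
where both terms are convex in \(u\); taking \(g(u)=\max(0,1-u)\) and \(h(u)=\max(0,s-u)\) puts any ramp-loss objective directly into the DC template above.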
The DCA is an iterative algorithm based on local optimality conditions and duality [14-17]. Its main scheme is as follows (for simplicity, we omit the dual part here): at each iteration, one replaces the second component h in the primal DC problem \((P_{dc})\) by its affine minorization \(h(x^{k})+(x-x^{k})^{T}y^{k}\), generating the convex program
\[
x^{k+1}\in \arg\min_{x\in R^{n}} \left\{ g(x)-\left[h(x^{k})+(x-x^{k})^{T}y^{k}\right] \right\},\qquad (42)
\]
where \(\partial h\) is the subdifferential of the convex function h. In practice, a simplified form of the DCA is used: two sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) satisfying \(y^{k}\in \partial h(x^{k})\) are constructed, and \(x^{k+1}\) is a solution to the convex program (42). The simplified DCA is described as follows.
Initialization: Choose an initial point \(x^{0}\in R^{n}\) and let \(k=0\)
Repeat
Calculate \(y^{k}\in \partial h(x^k)\)
Solve convex program (42) to obtain \(x^{k+1} \)
Let k:=k+1
Until some stopping criterion is satisfied.
DCA is a descent method without line search, and it converges linearly for general DC programs.
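To make the iteration concrete, the following minimal Python sketch runs the simplified DCA on a toy DC program; the decomposition \(g(x)=\Vert x\Vert^{2}\), \(h(x)=\Vert x\Vert_{1}\) and the closed-form subproblem solver are illustrative assumptions, not the RCH-loss model of the paper.

```python
import numpy as np

def dca(x0, subgrad_h, solve_convex, tol=1e-8, max_iter=100):
    """Simplified DCA for min f(x) = g(x) - h(x), with g and h convex.

    subgrad_h(x)     -- returns some y in the subdifferential of h at x
    solve_convex(y)  -- returns argmin_x { g(x) - x^T y }, i.e. program (42)
                        (constant terms of the affine minorization are dropped,
                        since they do not affect the argmin)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        y = subgrad_h(x)                      # y^k in subdifferential of h(x^k)
        x_new = solve_convex(y)               # x^{k+1} solves the convex program
        if np.linalg.norm(x_new - x) < tol:   # stopping criterion
            return x_new
        x = x_new
    return x

# Toy DC program: f(x) = ||x||^2 - ||x||_1, so g(x) = ||x||^2, h(x) = ||x||_1.
# sign(x) is a subgradient of h, and argmin_x { ||x||^2 - x^T y } = y/2 in closed form.
x_star = dca(np.array([0.8, -0.3]), np.sign, lambda y: y / 2.0)
print(x_star)  # -> [ 0.5 -0.5]; per coordinate, x^2 - |x| is minimized at |x| = 1/2
```

Each iterate requires solving only a convex problem; in the setting of this paper, that inner convex subproblem is the one handled by ADMM.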
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ding, G., Yang, L. Asymmetric kernel-based robust classification by ADMM. Knowl Inf Syst 65, 89–110 (2023). https://doi.org/10.1007/s10115-022-01758-6