Asymmetric kernel-based robust classification by ADMM

  • Regular Paper
Knowledge and Information Systems

Abstract

Correntropy is a local second-order statistical measure in kernel space, and different kernel functions induce correntropies with different properties. In this work, we propose an asymmetric mixture kernel and the corresponding correntropy. We then derive a new loss function (called the RCH-loss) induced by the correntropy with the reproducing asymmetric kernel. Several important properties of the proposed kernel and RCH-loss are demonstrated, including non-convexity, boundedness, asymmetry and asymptotic approximation; moreover, the RCH-loss satisfies the Bayes optimal decision rule. With the RCH-loss, a new framework for robust classification is built. Theoretically, we prove a generalization bound for the proposed model based on Rademacher complexity. A DC (difference of convex functions) programming algorithm (DCA) is then developed to solve the problem iteratively, where the ADMM (alternating direction method of multipliers) is used to solve the subproblem at each iteration. We also analyze the computational complexity of the algorithm and its sensitivity to parameters. Numerical experiments are carried out on various datasets, including benchmark and artificial data sets with different noise levels; the results demonstrate the feasibility and effectiveness of the proposed methods.


Notes

  1. https://archive.ics.uci.edu/ml/index.php.

References

  1. Vapnik VN (2002) The nature of statistical learning theory. IEEE Trans Neural Netw 8(6):1564

  2. Alam S, Sonbhadra SK, Agarwal S et al (2020) One-class support vector classifiers: a survey. Knowl-Based Syst 196:105754

  3. Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300

  4. Suykens JAK, Brabanter JD, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1):85–105

  5. Yang B, Shao QM, Pan L et al (2018) A study on regularized weighted least square support vector classifier. Pattern Recogn Lett 108:48–55

  6. Liu D, Shi Y, Tian Y et al (2016) Ramp loss least squares support vector machine. J Comput Sci S1877750316300096

  7. Shen X, Niu L, Qi Z et al (2017) Support vector machine classifier with truncated pinball loss. Pattern Recogn 68:199–210

  8. Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recogn 47(1):441–453

  9. Xu G, Cao Z, Hu BG et al (2017) Robust support vector machines based on the rescaled hinge loss function. Pattern Recogn 63:139–148

  10. Ren Z, Yang L (2018) Correntropy-based robust extreme learning machine for classification. Neurocomputing 313:74–84

  11. Yuan C, Yang L, Sun P (2021) Correntropy-based metric for robust twin support vector machine. Inf Sci 545:82–101

  12. Liu W, Pokharel PP, Principe JC (2007) Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans Signal Process 55(11):5286–5298

  13. Du B, Tang X, Wang Z et al (2018) Robust graph-based semisupervised learning for noisy labeled data via maximum correntropy criterion. IEEE Trans Cybern 99:1–14

  14. Feng Y, Huang X, Shi L et al (2015) Learning with the maximum correntropy criterion induced losses for regression. J Mach Learn Res 16(1):993–1034

  15. Yang L, Dong H (2019) Robust support vector machine with generalized quantile loss for classification and regression. Appl Soft Comput 81:105483

  16. Chen B, Wang X, Lu N, Wang S, Cao J, Qin J (2018) Mixture correntropy for robust learning. Pattern Recogn 79:318–327

  17. Wang Y, Yang L (2019) A robust classification framework with mixture correntropy. Inf Sci 491:306–318

  18. Yang L, Sheng D, Chao Y, Min Z (2020) Robust regression framework with asymmetrically analogous to correntropy-induced loss. Knowl-Based Syst 191:105211

  19. Oliveira WD (2019) Proximal bundle methods for nonsmooth DC programming. J Global Optim 75(2):523–563

  20. Majzoobi L, Lahouti F, Shah-Mansouri V (2019) Analysis of distributed ADMM algorithm for consensus optimization in presence of node error. IEEE Trans Signal Process 67(7):1

  21. Boyd S, Parikh N, Chu E et al (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122

  22. Geng FZ, Qian SP (2014) Piecewise reproducing kernel method for singularly perturbed delay initial value problems. Appl Math Lett 37(11):67–71

  23. Boyd S, Vandenberghe L (2006) Convex optimization. IEEE Trans Autom Control 51(11):1859

  24. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3:1–122

  25. Hajinezhad D, Shi Q (2018) Alternating direction method of multipliers for a class of nonconvex bilinear optimization: convergence analysis and applications. J Global Optim 70(1):261–288

  26. Ding H, Qin S, Wu Y et al (2021) Asymptotic properties on high-dimensional multivariate regression M-estimation. J Multivar Anal

  27. Duc KT, Chiogna M, Adimari G (2020) Nonparametric estimation of ROC surfaces under verification bias. REVSTAT Stat J 18(5):697–720

  28. Huang X, Shi L, Suykens JAK (2014) Asymmetric least squares support vector machine classifiers. Comput Stat Data Anal 70(2):395–405

  29. Lin Y (2004) A note on margin-based loss functions in classification. Stat Probab Lett 68(1):73–82

  30. Bartlett PL, Mendelson S (2003) Rademacher and Gaussian complexities: risk bounds and structural results. J Mach Learn Res 3:463–482

Acknowledgements

This work is supported by the National Natural Science Foundation of China (11471010, 11271367). The authors thank the referees and the editors, whose suggestions improved the paper significantly.

Author information

Corresponding author

Correspondence to Liming Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix. DC programming and DCA

We outline the main algorithmic results for DC programming [18, 19]. The key idea is to decompose the objective function into the difference of two convex functions; successive convex approximations of the objective then yield a sequence of solutions that converges to a stationary point, possibly an optimal solution. In general, a so-called DC program minimizes a DC function:

$$\begin{aligned} f(x)=g(x)-h(x),\quad x\in R^{n} \end{aligned}$$
(41)

with g(x) and h(x) being convex functions.
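For intuition, a standard example of such a decomposition (our illustration, not taken from the paper) is the non-convex ramp loss of [6], which splits into two convex hinge-type functions:

$$\begin{aligned} \min \{1,\max (0,1-t)\}=\underbrace{\max (0,1-t)}_{g(t)}-\underbrace{\max (0,-t)}_{h(t)}, \end{aligned}$$

so that minimizing the ramp loss fits the DC template (41) directly; the non-convex RCH-loss model in the main text is handled in the same spirit.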

The DCA is an iterative algorithm based on local optimality conditions and duality [14–17]. Its main scheme is as follows (for simplicity, we omit the dual part here): at each iteration, one replaces the second component h of the primal DC problem (\(P_{dc}\)) by its affine minorization \(h(x^{k})+(x-x^{k})^{T}y^{k}\), which generates the convex program:

$$\begin{aligned} \min _{x\in R^{n}}\left\{ g(x)-h(x^{k})-(x-x^{k})^{T}y^{k}\right\} ,\quad y^{k}\in \partial h(x^{k}) \end{aligned}$$
(42)

where \( \partial h\) is the subdifferential of convex function h. In practice, a simplified form of the DCA is used. Two sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) satisfying \( y^{k}\in \partial h(x^{k})\) are constructed, and \(x^{k+1}\) is a solution to the convex program (42). The simplified DCA is described as follows.

Initialization: Choose an initial point \(x^{0}\in R^{n}\) and let \(k=0\)

Repeat

Calculate \(y^{k}\in \partial h(x^k)\)

Solve convex program (42) to obtain \(x^{k+1} \)

Let k:=k+1

Until some stopping criterion is satisfied.

DCA is a descent method without line search, and it converges linearly for general DC programs.
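To make the scheme concrete, the following is a minimal Python sketch of the simplified DCA, assuming the toy decomposition \(f(x)=\Vert x\Vert ^{2}-2\Vert x\Vert _{1}\) (our example, not the paper's model); the convex subproblem (42) is solved here with a generic solver in place of the ADMM used in the paper.

import numpy as np
from scipy.optimize import minimize

def g(x):                                  # first convex component: ||x||^2
    return float(np.dot(x, x))

def grad_h(x):                             # a subgradient of h(x) = 2*||x||_1
    return 2.0 * np.sign(x)                # sign(0) = 0 is a valid subgradient

def dca(x0, tol=1e-6, max_iter=100):
    # Simplified DCA: linearize h at x^k and solve the convex program (42).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        y = grad_h(x)                      # y^k in the subdifferential of h
        # Subproblem (42): minimize g(z) - h(x^k) - (z - x^k)^T y^k;
        # the terms that are constant in z do not affect the minimizer.
        z = minimize(lambda z: g(z) - y @ z, x).x
        if np.linalg.norm(z - x) < tol:    # stopping criterion
            break
        x = z
    return x

# f has stationary points at x_i = +/-1; starting from [3, -2],
# the iteration reaches [1, -1] after one subproblem solve.
print(dca(np.array([3.0, -2.0])))

Each outer iteration decreases the objective, in line with the descent property noted above.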

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ding, G., Yang, L. Asymmetric kernel-based robust classification by ADMM. Knowl Inf Syst 65, 89–110 (2023). https://doi.org/10.1007/s10115-022-01758-6
