Robust classification for skewed data

  • Regular Article
  • Advances in Data Analysis and Classification

Abstract

In this paper we propose a robust classification rule for skewed unimodal distributions. For low-dimensional data, the classifier assigns an observation to the group for which its adjusted outlyingness is smallest. For high-dimensional data, the robustified SIMCA method is adjusted for skewness. The robustness of the methods is investigated through different simulations and by applying them to some datasets.
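
To make the low-dimensional rule concrete, the following is a minimal univariate sketch of assigning an observation to the group for which its adjusted outlyingness is smallest. It is not the authors' implementation: the abstract gives no formulas, so the medcouple-based fences (constants 1.5, -4 and 3), the naive quadratic-time medcouple that skips tied pairs, and the function names `medcouple`, `adjusted_outlyingness` and `classify` are all assumptions taken from the general adjusted-boxplot literature. A multivariate version would also need projections onto many directions, which this sketch leaves out.

```python
import math
import statistics


def medcouple(values):
    """Naive O(n^2) medcouple, a robust skewness measure in [-1, 1].

    Simplification (assumption): pairs with x_i == x_j are skipped
    instead of using the special tie kernel.
    """
    med = statistics.median(values)
    lower = [v for v in values if v <= med]
    upper = [v for v in values if v >= med]
    kernel = [((xj - med) - (med - xi)) / (xj - xi)
              for xi in lower for xj in upper if xj != xi]
    return statistics.median(kernel) if kernel else 0.0


def adjusted_outlyingness(x, group):
    """Distance of x from the group median, scaled by the skewness-adjusted
    whisker on the same side of the median (univariate case only)."""
    med = statistics.median(group)
    q1, _, q3 = statistics.quantiles(group, n=4)
    iqr = q3 - q1
    mc = medcouple(group)
    # Skewness-adjusted fences: the longer tail gets the wider fence.
    if mc >= 0:
        low_fence = q1 - 1.5 * math.exp(-4 * mc) * iqr
        high_fence = q3 + 1.5 * math.exp(3 * mc) * iqr
    else:
        low_fence = q1 - 1.5 * math.exp(-3 * mc) * iqr
        high_fence = q3 + 1.5 * math.exp(4 * mc) * iqr
    # Whiskers are the most extreme observations inside the fences.
    low_whisker = min(v for v in group if v >= low_fence)
    high_whisker = max(v for v in group if v <= high_fence)
    # Degenerate groups (whisker equal to the median) are not handled here.
    if x >= med:
        return (x - med) / (high_whisker - med)
    return (med - x) / (med - low_whisker)


def classify(x, groups):
    """Return the label of the group in which x is least outlying."""
    return min(groups, key=lambda label: adjusted_outlyingness(x, groups[label]))
```

Calling `classify(x, {"a": [...], "b": [...]})` with one list of training values per group returns the label with the smallest outlyingness. Scaling each side of the median by its own whisker is what keeps a point in the long tail of a skewed group from being penalized as heavily as it would be under a symmetric scale such as the MAD.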

Author information

Corresponding author

Correspondence to Mia Hubert.

Additional information

We acknowledge financial support by the GOA/07/04-project of the Research Fund K.U.Leuven and by the IAP research network no. P6/03 of the Federal Science Policy, Belgium.

About this article

Cite this article

Hubert, M., Van der Veeken, S. Robust classification for skewed data. Adv Data Anal Classif 4, 239–254 (2010). https://doi.org/10.1007/s11634-010-0066-3

  • DOI: https://doi.org/10.1007/s11634-010-0066-3
