Detecting Multivariate Outliers Using Projection Pursuit with Particle Swarm Optimization

  • Conference paper
Proceedings of COMPSTAT'2010

Abstract

Detecting outliers in multivariate data is known to be an important but difficult task, and several detection methods already exist. Most of the proposed methods are based either on the Mahalanobis distance of the observations to the center of the distribution or on a projection pursuit (PP) approach. In the present paper we focus on the one-dimensional PP approach, which may be of particular interest when the data are not elliptically symmetric. We give a survey of the statistical literature on PP for multivariate outlier detection and investigate the pros and cons of the different methods. We also propose the use of a recent heuristic optimization algorithm called Tribes for multivariate outlier detection in the projection pursuit context.
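
The one-dimensional PP idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions that do not come from the abstract: the projection index is the absolute excess kurtosis of the projected data, the optimizer is a plain global-best particle swarm with fixed coefficients (a simple stand-in for Tribes, which adapts its swarm and is not reproduced here), the data are synthetic, and the cutoff of 3.5 on a median/MAD score is only a conventional illustrative choice. The names kurtosis_index and pso_maximize are hypothetical.

```python
import numpy as np


def kurtosis_index(a, X):
    """Assumed projection index: |excess kurtosis| of X projected on direction a."""
    a = a / np.linalg.norm(a)
    z = X @ a
    z = (z - z.mean()) / z.std()
    return abs(np.mean(z ** 4) - 3.0)


def pso_maximize(f, dim, n_particles=30, n_iter=200, seed=0):
    """Plain global-best PSO over unit vectors (a stand-in, not Tribes)."""
    rng = np.random.default_rng(seed)
    pos = rng.normal(size=(n_particles, dim))
    pos /= np.linalg.norm(pos, axis=1, keepdims=True)
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                       # personal best positions
    best_val = np.array([f(p) for p in pos])    # personal best index values
    g_idx = best_val.argmax()
    g_pos, g_val = best_pos[g_idx].copy(), best_val[g_idx]
    w, c1, c2 = 0.72, 1.49, 1.49                # commonly used fixed coefficients
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g_pos - pos)
        pos = pos + vel
        # Project particles back onto the unit sphere of directions.
        pos /= np.linalg.norm(pos, axis=1, keepdims=True)
        vals = np.array([f(p) for p in pos])
        improved = vals > best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g_idx = best_val.argmax()
        if best_val[g_idx] > g_val:
            g_pos, g_val = best_pos[g_idx].copy(), best_val[g_idx]
    return g_pos, g_val


# Synthetic example: 200 standard normal points in 5 dimensions,
# the first 10 shifted along the first coordinate to act as outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X[:10, 0] += 8.0

direction, value = pso_maximize(lambda a: kurtosis_index(a, X), dim=5)

# Score observations on the selected projection with a median/MAD rule.
z = X @ direction
med = np.median(z)
mad = 1.4826 * np.median(np.abs(z - med))
scores = np.abs(z - med) / mad
flagged = scores > 3.5                          # illustrative cutoff only

print("index value:", value)
print("number flagged:", int(flagged.sum()))
```

A single run returns only one direction. Since a heuristic search of this kind can stop at a local optimum of the index, in practice one would typically inspect several runs or several local optima rather than rely on one projection.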


Author information

Corresponding author

Correspondence to Anne Ruiz-Gazen.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ruiz-Gazen, A., Marie-Sainte, S.L., Berro, A. (2010). Detecting Multivariate Outliers Using Projection Pursuit with Particle Swarm Optimization. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_8
