
A resistant learning procedure for coping with outliers

Annals of Mathematics and Artificial Intelligence

Abstract

In resistant learning, outliers are observations that lie far from a fitting function deduced from a subset of the given observations, where the form of that function is adaptable during the process. This study presents a resistant learning procedure for coping with outliers via a single-hidden-layer feed-forward neural network (SLFN). The smallest trimmed sum of squared residuals principle guides the proposed procedure, whose key mechanisms are: an analysis mechanism that excludes potential outliers at early stages of the process, a modeling mechanism that deduces enough hidden nodes to fit the reference observations, an estimating mechanism that tunes the associated weights of the SLFN, and a deletion diagnostics mechanism that checks whether the resulting SLFN is stable. The lake data set is used to demonstrate the resistant-learning performance of the proposed procedure.
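The guiding idea described above can be illustrated with a minimal sketch: fit a model on a small trusted subset, then repeatedly admit the outside observation with the smallest residual until a trimmed fraction of the data is covered; observations never admitted are flagged as potential outliers. This is an assumption-laden illustration, not the paper's procedure: a polynomial least-squares fit stands in for the SLFN, and the function names (`trimmed_sse`, `resistant_fit`) and subset-initialization rule are hypothetical.

```python
import numpy as np

def trimmed_sse(residuals, h):
    """Smallest trimmed sum of squared residuals:
    the sum of the h smallest squared residuals."""
    r2 = np.sort(residuals ** 2)
    return r2[:h].sum()

def resistant_fit(x, y, h=None, degree=2):
    """Forward-search-style sketch of resistant learning.

    Grows a reference subset by always admitting the observation with
    the smallest absolute residual under the current fit, so gross
    outliers are deferred and ultimately excluded.  A degree-`degree`
    polynomial fit stands in for the adaptable SLFN of the paper.
    """
    n = len(x)
    h = h or (n // 2 + 1)                      # trimming constant
    # Initialize with the observations closest to the median response.
    order = np.argsort(np.abs(y - np.median(y)))
    ref = list(order[:degree + 2])             # small initial reference set
    while len(ref) < h:
        coef = np.polyfit(x[ref], y[ref], degree)   # refit on reference set
        resid = y - np.polyval(coef, x)
        outside = [i for i in range(n) if i not in ref]
        # Admit the outside observation that fits the current model best.
        ref.append(min(outside, key=lambda i: abs(resid[i])))
    coef = np.polyfit(x[ref], y[ref], degree)
    resid = y - np.polyval(coef, x)
    flagged = np.setdiff1d(np.arange(n), ref)  # never admitted: potential outliers
    return coef, flagged, trimmed_sse(resid, h)
```

On clean quadratic data with a few gross contaminations, the contaminated points stay outside the reference set, so the trimmed criterion remains near zero while the excluded indices mark the planted outliers.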



Author information

Corresponding author

Correspondence to Rua-Huan Tsaih.

Cite this article

Tsaih, RH., Cheng, TC. A resistant learning procedure for coping with outliers. Ann Math Artif Intell 57, 161–180 (2009). https://doi.org/10.1007/s10472-010-9183-0

