Abstract
In resistant learning, outliers are observations that lie far from the fitting function, which is deduced from a subset of the given observations and whose form can adapt during the process. This study presents a resistant learning procedure for coping with outliers via a single-hidden-layer feed-forward neural network (SLFN). The smallest trimmed sum of squared residuals principle guides the proposed procedure, whose key mechanisms are: an analysis mechanism that excludes potential outliers at early stages of the process, a modeling mechanism that deduces enough hidden nodes to fit the reference observations, an estimating mechanism that tunes the associated weights of the SLFN, and a deletion diagnostics mechanism that checks whether the resulting SLFN is stable. The lake data set is used to demonstrate the resistant-learning performance of the proposed procedure.
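The core idea behind the trimming principle can be sketched in a few lines. The snippet below is a minimal illustration only: a plain linear model stands in for the SLFN, and the forward-search-style loop, the function names, and the subset-growing schedule are assumptions for exposition, not the authors' exact algorithm. It shows how a fit can be guided by the smallest trimmed sum of squared residuals, keeping the `h` observations closest to the current fit as the reference subset and flagging the rest as potential outliers.

```python
# Illustrative sketch of the smallest-trimmed-sum-of-squared-residuals idea.
# A simple linear model stands in for the SLFN; names and the loop schedule
# are assumptions for illustration, not the paper's exact procedure.

def fit_line(pts):
    """Ordinary least-squares fit y = a + b*x over the given points."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b  # intercept, slope

def trimmed_sse(params, pts, h):
    """Sum of the h smallest squared residuals: the trimming criterion."""
    a, b = params
    sq = sorted((y - (a + b * x)) ** 2 for x, y in pts)
    return sum(sq[:h])

def resistant_fit(pts, h):
    """Grow a reference subset one observation at a time, refitting on the
    m observations closest to the current fit; the points left outside the
    final h-subset are flagged as potential outliers."""
    a, b = fit_line(pts)
    for m in range(max(2, h // 2), h + 1):
        ranked = sorted(pts, key=lambda p: (p[1] - (a + b * p[0])) ** 2)
        a, b = fit_line(ranked[:m])
    ranked = sorted(pts, key=lambda p: (p[1] - (a + b * p[0])) ** 2)
    return (a, b), ranked[h:]  # fitted params, flagged potential outliers

# Ten clean points on y = 2x + 1 plus one gross outlier at (4, 50).
data = [(x, 2.0 * x + 1.0) for x in range(10)] + [(4, 50.0)]
params, outliers = resistant_fit(data, h=10)
print(outliers)  # the gross outlier (4, 50.0) is the flagged point
```

Because the refits are driven only by the observations nearest the current fit, a single gross outlier cannot drag the final parameters the way it would under plain least squares; the trimmed criterion `trimmed_sse` is what makes the procedure resistant.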
Tsaih, RH., Cheng, TC. A resistant learning procedure for coping with outliers. Ann Math Artif Intell 57, 161–180 (2009). https://doi.org/10.1007/s10472-010-9183-0
Keywords
- Resistant learning
- Outliers
- Single-hidden-layer feed-forward neural networks
- Smallest trimmed sum of squared residuals principle
- Deletion diagnostics