Abstract
Recent developments in multivariate smoothing methods provide a rich collection of feasible models for nonparametric multivariate data analysis. Among the most interpretable are those with smoothed additive terms. The construction of methods and algorithms for computing these models has been the main concern of the literature in this area. Fewer results are available on the validation of the computed fit, and many applications of nonparametric methods end up computing and comparing the generalized validation error or related indexes. This article reviews the behaviour of some of the best-known multivariate nonparametric methods, based on subset selection and on projection, when exact collinearity or multicollinearity (near collinearity) is present in the input matrix. It shows the possible aliasing effects in the computed fits of some selection methods and explores the properties of the projection spaces reached by projection methods, in order to help data analysts select the best model when the input matrix is ill-conditioned. Two simulation studies and an application to a real data set are presented to further illustrate the effects of collinearity or multicollinearity on the fit.
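As a small illustration of what near collinearity in an input matrix looks like numerically, the sketch below computes the variance inflation factor (VIF) for a two-column input. The VIF is a standard linear diagnostic chosen here for illustration and is not claimed to be the diagnostic used in the article. The helper names `pearson_r` and `vif_two_inputs` and the toy data are hypothetical. With only two inputs, the VIF reduces to 1 / (1 - r^2), where r is their sample correlation.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_inputs(x, y):
    """VIF for an input matrix with exactly two columns: 1 / (1 - r^2)."""
    r2 = pearson_r(x, y) ** 2
    # Guard against r^2 reaching 1 (exact collinearity), where the VIF is unbounded.
    return math.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)

# Hypothetical toy inputs (assumption, not data from the article).
x1 = [float(i) for i in range(1, 11)]
# Nearly a linear function of x1: 2 * x1 plus a tiny alternating perturbation.
x2_near = [2.0 * a + 0.01 * (-1) ** i for i, a in enumerate(x1)]
# A shuffled sequence that is only weakly correlated with x1.
x2_unrelated = [3.0, 9.0, 1.0, 7.0, 5.0, 10.0, 2.0, 8.0, 4.0, 6.0]

vif_near = vif_two_inputs(x1, x2_near)
vif_unrelated = vif_two_inputs(x1, x2_unrelated)

print(f"VIF, nearly collinear inputs: {vif_near:.1f}")
print(f"VIF, weakly related inputs:   {vif_unrelated:.3f}")
```

A very large VIF for the first pair signals an ill-conditioned input matrix, while a value close to 1 for the second pair indicates little linear dependence. This linear check does not detect concurvity, that is, dependence among nonlinear transformations of the inputs.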
Cite this article
Morlini, I. On Multicollinearity and Concurvity in Some Nonlinear Multivariate Models. JISS 15, 3–26 (2006). https://doi.org/10.1007/s10260-006-0005-9
DOI: https://doi.org/10.1007/s10260-006-0005-9