Summary
Additive models of the form y = f1(x1) + ... + fp(xp) + ε, where the functions fj, j = 1, ..., p, have unspecified functional form, are flexible regression models that can be used to characterize nonlinear regression effects. One way of fitting additive models is to expand each fj in B-splines and add a penalty that prevents overfitting. The performance of this penalized B-spline (P-spline) approach depends strongly on the amount of smoothing chosen for the components fj. Choosing it is computationally demanding, particularly in higher-dimensional settings. In this paper we treat the problem of choosing the smoothing parameters for P-splines by genetic algorithms. In several simulation studies this approach is compared with various alternative methods of fitting additive models. In particular, functions with different spatial variability are considered, and the effect of constant versus locally adaptive smoothing parameters is evaluated.
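To make the idea concrete, the following is a minimal sketch of choosing one smoothing parameter per component of an additive P-spline fit with a small real-coded genetic algorithm. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the GCV criterion as fitness, cubic B-splines on equally spaced knots with a second-order difference penalty, binary tournament selection, arithmetic crossover, Gaussian mutation, elitism, the search range for log10(lambda), the tiny ridge term that resolves the confounded intercepts, and all function names (`bspline_basis`, `gcv_score`, `ga_select`).

```python
import numpy as np

def bspline_basis(x, n_seg=10, deg=3):
    """B-spline basis on equally spaced knots (Cox-de Boor recursion)."""
    span = x.max() - x.min()
    lo, hi = x.min() - 1e-6 * span, x.max() + 1e-6 * span  # keep all x strictly inside
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-deg, n_seg + deg + 1)
    xc = x[:, None]
    B = ((xc >= knots[:-1]) & (xc < knots[1:])).astype(float)  # degree-0 indicators
    for d in range(1, deg + 1):
        left = (xc - knots[:-d - 1]) / (d * h)
        right = (knots[d + 1:] - xc) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (n, n_seg + deg)

def gcv_score(log_lams, bases, penalties, y, ridge=1e-8):
    """GCV and fitted values for one vector of log10 smoothing parameters."""
    B = np.hstack(bases)
    n, m = B.shape
    P = np.zeros((m, m))
    start = 0
    for log_lam, Pj in zip(log_lams, penalties):  # block-diagonal penalty
        k = Pj.shape[0]
        P[start:start + k, start:start + k] = 10.0 ** log_lam * Pj
        start += k
    # Assumption: a tiny ridge makes the system solvable although the
    # component bases share the constant function.
    A = B.T @ B + P + ridge * np.eye(m)
    coef = np.linalg.solve(A, B.T @ y)
    edf = np.trace(np.linalg.solve(A, B.T @ B))  # trace of the hat matrix
    rss = np.sum((y - B @ coef) ** 2)
    return n * rss / (n - edf) ** 2, B @ coef

def ga_select(bases, penalties, y, pop_size=20, n_gen=15,
              bounds=(-4.0, 4.0), seed=0):
    """Minimise GCV over log10(lambda_1), ..., log10(lambda_p)."""
    rng = np.random.default_rng(seed)
    p = len(bases)
    score = lambda ind: gcv_score(ind, bases, penalties, y)[0]
    pop = rng.uniform(*bounds, size=(pop_size, p))
    fit = np.array([score(ind) for ind in pop])
    history = [fit.min()]
    for _ in range(n_gen):
        children = [pop[fit.argmin()].copy()]  # elitism: keep the best individual
        while len(children) < pop_size:
            # binary tournament selection of two parents
            i = rng.integers(pop_size, size=2)
            j = rng.integers(pop_size, size=2)
            pa = pop[i[np.argmin(fit[i])]]
            pb = pop[j[np.argmin(fit[j])]]
            w = rng.uniform()
            child = w * pa + (1 - w) * pb            # arithmetic crossover
            child = child + rng.normal(0, 0.3, p)    # Gaussian mutation
            children.append(np.clip(child, *bounds))
        pop = np.array(children)
        fit = np.array([score(ind) for ind in pop])
        history.append(fit.min())
    return pop[fit.argmin()], history

# Simulated example with two components of different smoothness.
rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x1) + (x2 - 0.5) ** 2 + rng.normal(0, 0.3, n)

bases = [bspline_basis(x1), bspline_basis(x2)]
diffs = [np.diff(np.eye(B.shape[1]), n=2, axis=0) for B in bases]
penalties = [D.T @ D for D in diffs]  # second-order difference penalties

best_log_lams, history = ga_select(bases, penalties, y)
gcv, fitted = gcv_score(best_log_lams, bases, penalties, y)
print("log10(lambda):", best_log_lams, "GCV:", gcv)
```

Because the best individual is carried over unchanged, the best GCV value in `history` cannot increase from one generation to the next. With p components the search space has p dimensions, which is where a population-based search may be preferable to a full grid. Locally adaptive smoothing would add further parameters per component and is not covered by this sketch.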
Acknowledgement
We thank Stefan Lang, Thomas Kneib and David Rummel for their assistance and for supplying some of the simulation results used in this study.
Cite this article
Krause, R., Tutz, G. Genetic algorithms for the selection of smoothing parameters in additive models. Computational Statistics 21, 9–31 (2006). https://doi.org/10.1007/s00180-006-0248-9
DOI: https://doi.org/10.1007/s00180-006-0248-9