Abstract
The concept of degrees of freedom plays an important role in statistical modeling and is commonly used to measure model complexity. The number of unknown parameters, which typically serves as the degrees of freedom in linear regression models, can be an inadequate measure for some modeling procedures, in particular for linear mixed effects models. In this article, we propose a new definition of generalized degrees of freedom in linear mixed effects models. It is derived from the sum of the sensitivities of the expected fitted values with respect to their underlying true means. We explore and compare data perturbation and the residual bootstrap as ways to estimate model complexity empirically. We also show that this empirical generalized degrees of freedom measure satisfies some desirable properties and is useful for the selection of linear mixed effects models.
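To make the "sum of sensitivities" idea concrete, the following is a minimal sketch of a generic Monte Carlo data-perturbation estimate. It is not the estimator developed in the article: it is applied to ordinary least squares rather than to a linear mixed effects model, and the function names `perturbation_gdf` and `ols_fit`, the perturbation scale `tau`, and the simulated data are all assumptions made for illustration. Each response is perturbed by small Gaussian noise, the model is refitted, and the slope of each fitted value on its own perturbation is taken as its estimated sensitivity.

```python
import numpy as np


def perturbation_gdf(fit, y, tau=0.5, n_rep=500, seed=0):
    """Estimate the sum of sensitivities of fitted values by data perturbation.

    `fit` is any callable mapping a response vector to fitted values
    (hypothetical interface, chosen for this sketch only).
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # One row of Gaussian perturbations per Monte Carlo replicate.
    delta = rng.normal(0.0, tau, size=(n_rep, n))
    # Refit on each perturbed response; result has shape (n_rep, n).
    fitted = np.stack([fit(y + d) for d in delta])
    # Least-squares slope of fitted_i on delta_i, one slope per observation.
    dc = delta - delta.mean(axis=0)
    fc = fitted - fitted.mean(axis=0)
    slopes = (dc * fc).sum(axis=0) / (dc ** 2).sum(axis=0)
    return slopes.sum()


def ols_fit(X):
    """Return a fitting procedure that projects y onto the columns of X."""
    H = X @ np.linalg.pinv(X)  # hat matrix
    return lambda y: H @ y


# Simulated example (assumed data): intercept plus two covariates.
rng = np.random.default_rng(1)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

gdf = perturbation_gdf(ols_fit(X), y)
print(f"estimated generalized degrees of freedom: {gdf:.2f}")
```

For a linear fitting procedure such as this one, the estimate should land close to the number of regression coefficients (here 3), which is the case where parameter counting and the sensitivity-based measure agree. Replacing `ols_fit` with a procedure that also estimates variance components or selects variables is where the two can diverge.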
Acknowledgments
We are very thankful for the comments from the associate editor and the three referees on the original draft. These led to a significantly improved presentation of the article. This research was partially supported by Australian Research Council Discovery Project DP110101998 (SM) and Australian Research Council Discovery Early Career Award DE130101670 (JO).
Cite this article
You, C., Müller, S. & Ormerod, J.T. On generalized degrees of freedom with application in linear mixed models selection. Stat Comput 26, 199–210 (2016). https://doi.org/10.1007/s11222-014-9488-7