Parsimonious Classification Via Generalized Linear Mixed Models

Journal of Classification

Abstract

We devise a classification algorithm based on generalized linear mixed model (GLMM) technology. The algorithm incorporates spline smoothing, additive model-type structures and model selection. For reasons of speed, we employ the Laplace approximation rather than Monte Carlo methods. Tests on real and simulated data show the algorithm to have good classification performance. Moreover, the resulting classifiers are generally interpretable and parsimonious.
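
To illustrate why a Laplace approximation can stand in for Monte Carlo integration, the following is a minimal sketch under stated assumptions. It is not the paper's algorithm: it uses a toy logistic model with a single random intercept u ~ N(0, sigma^2) and a fixed intercept beta, and it leaves out splines, additive structure and model selection. All function names, the data vector and the parameter values are hypothetical. The sketch approximates the log marginal likelihood, log ∫ p(y | u) p(u) du, by a second-order expansion around the mode of the integrand, and compares the result with brute-force quadrature.

```python
import math

def log_joint(u, y, beta, sigma):
    """log p(y | u) + log p(u) for a toy random-intercept logistic model."""
    eta = beta + u
    loglik = sum(yi * eta - math.log1p(math.exp(eta)) for yi in y)
    logprior = -0.5 * math.log(2 * math.pi * sigma**2) - u**2 / (2 * sigma**2)
    return loglik + logprior

def laplace_log_marginal(y, beta, sigma, iters=50):
    """Laplace approximation to log of the integral of exp(log_joint) over u."""
    n, s = len(y), sum(y)
    u = 0.0
    for _ in range(iters):  # Newton's method to locate the mode in u
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        grad = s - n * p - u / sigma**2
        hess = -n * p * (1 - p) - 1 / sigma**2  # always negative (concave)
        u -= grad / hess
    p = 1.0 / (1.0 + math.exp(-(beta + u)))
    hess = -n * p * (1 - p) - 1 / sigma**2
    # Gaussian integral of the second-order expansion around the mode
    return log_joint(u, y, beta, sigma) + 0.5 * math.log(2 * math.pi / -hess)

def quadrature_log_marginal(y, beta, sigma, half_width=10.0, steps=20001):
    """Trapezoid rule on a wide grid, used only as a reference value."""
    a = -half_width * sigma
    h = 2 * half_width * sigma / (steps - 1)
    total = 0.0
    for k in range(steps):
        w = 0.5 if k in (0, steps - 1) else 1.0
        total += w * math.exp(log_joint(a + k * h, y, beta, sigma))
    return math.log(total * h)

# Hypothetical binary responses sharing one random intercept
y = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1]
lap = laplace_log_marginal(y, beta=0.3, sigma=1.0)
quad = quadrature_log_marginal(y, beta=0.3, sigma=1.0)
print(f"Laplace: {lap:.4f}  quadrature: {quad:.4f}")
```

The Laplace version needs only a few Newton steps and one curvature evaluation, whereas the reference integral evaluates the integrand thousands of times. In a model with many random effects, grid quadrature becomes infeasible and Monte Carlo becomes the costly alternative.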

Author information

Corresponding author

Correspondence to G. Kauermann.

Additional information

The first author acknowledges support from the Deutsche Forschungsgemeinschaft (Project D 310 122 40). The second and third authors acknowledge support from the Australian Research Council (Project DP0556518).

About this article

Cite this article

Kauermann, G., Ormerod, J.T. & Wand, M.P. Parsimonious Classification Via Generalized Linear Mixed Models. J Classif 27, 89–110 (2010). https://doi.org/10.1007/s00357-010-9045-9

  • DOI: https://doi.org/10.1007/s00357-010-9045-9
