Abstract
We present a flexible non-parametric generative model for multilevel regression that automatically balances identifying effects common across groups against respecting each group's idiosyncrasies. The model is built from techniques that are now standard in the statistical parameter estimation literature, namely Hierarchical Dirichlet Processes (HDP) and Hierarchical Generalized Linear Models (HGLM); we therefore name it “Infinite Mixtures of Hierarchical Generalized Linear Models” (iHGLM). We demonstrate how using an HDP prior in local, groupwise GLM modeling of response-covariate densities allows iHGLM to capture latent similarities and differences within and across groups. On several synthetic and real-world datasets, we show that iHGLM is more accurate than well-known competing methods such as the Generalized Linear Mixed Model (GLMM), Regression Tree, Least Squares Regression, Bayesian Linear Regression, Ordinary Dirichlet Process Regression, and several other regression models.
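To make the idea of shared and group-specific effects concrete, the following is a minimal generative sketch, not the authors' implementation or inference procedure. It assumes a truncated stick-breaking approximation to the HDP, a Gaussian GLM with identity link (that is, plain linear regression) for each component, and hypothetical names and hyperparameters (`stick_breaking`, `sample_groups`, `gamma`, `alpha`, `K`). All groups draw from the same pool of regression components, while each group has its own mixing weights over that pool.

```python
import numpy as np


def stick_breaking(gamma, K, rng):
    """Truncated GEM(gamma) weights over K shared components (assumed truncation)."""
    v = rng.beta(1.0, gamma, size=K)
    v[-1] = 1.0  # close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_groups(n_groups=3, n_per_group=50, K=5, gamma=1.0, alpha=5.0, seed=0):
    """Draw grouped (x, y) data from a truncated HDP mixture of linear regressions."""
    rng = np.random.default_rng(seed)
    beta = stick_breaking(gamma, K, rng)        # global weights, shared by all groups
    coefs = rng.normal(0.0, 2.0, size=(K, 2))   # (intercept, slope) per shared component
    x_means = rng.normal(0.0, 3.0, size=K)      # covariate mean per shared component
    groups = []
    for _ in range(n_groups):
        # Under truncation, a group-level DP draw reduces to Dirichlet(alpha * beta).
        pi_j = rng.dirichlet(alpha * beta + 1e-8)
        pi_j = pi_j / pi_j.sum()
        z = rng.choice(K, size=n_per_group, p=pi_j)   # latent component per observation
        x = rng.normal(x_means[z], 1.0)               # covariate density is local to the component
        y = coefs[z, 0] + coefs[z, 1] * x + rng.normal(0.0, 0.1, size=n_per_group)
        groups.append({"x": x, "y": y, "z": z, "pi": pi_j})
    return beta, coefs, groups


if __name__ == "__main__":
    beta, coefs, groups = sample_groups()
    for j, g in enumerate(groups):
        print(f"group {j}: weights over shared components = {np.round(g['pi'], 3)}")
```

Components that receive large weight in several groups play the role of common effects, while components that are heavily weighted in only one group correspond to idiosyncratic effects. Fitting such a model to data requires posterior inference, which this sketch does not attempt.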
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Islam, S.M., Banerjee, A. (2017). Automatic Discovery of Common and Idiosyncratic Latent Effects in Multilevel Regression. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10234. Springer, Cham. https://doi.org/10.1007/978-3-319-57454-7_6
DOI: https://doi.org/10.1007/978-3-319-57454-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57453-0
Online ISBN: 978-3-319-57454-7
eBook Packages: Computer Science, Computer Science (R0)