Modeling with Mixtures of Linear Regressions

Published in: Statistics and Computing

Abstract

Consider data (x_1, y_1), ..., (x_n, y_n), where each x_i may be vector valued, and the distribution of y_i given x_i is a mixture of linear regressions. This generalizes mixture models, which do not include covariates in the mixture formulation. The mixture of linear regressions formulation has appeared in the computer science literature under the name "Hierarchical Mixtures of Experts" model.

This model has been considered from both frequentist and Bayesian viewpoints. We focus on the Bayesian formulation. Previously, estimation of the mixture of linear regression model has been done through straightforward Gibbs sampling with latent variables. This paper contributes to this field in three major areas. First, we provide a theoretical underpinning to the Bayesian implementation by demonstrating consistency of the posterior distribution. This demonstration is done by extending results in Barron, Schervish and Wasserman (Annals of Statistics 27: 536–561, 1999) on bracketing entropy to the regression setting. Second, we demonstrate through examples that straightforward Gibbs sampling may fail to effectively explore the posterior distribution and provide alternative algorithms that are more accurate. Third, we demonstrate the usefulness of the mixture of linear regressions framework in Bayesian robust regression. The methods described in the paper are applied to two examples.
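The straightforward Gibbs sampler with latent variables mentioned above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes a fixed noise variance, conjugate normal priors on the regression coefficients, and a Dirichlet prior on the mixing weights, and it alternates between sampling component allocations, coefficients, and weights.

```python
import numpy as np

def gibbs_mixture_regression(x, y, K=2, iters=500, seed=0):
    """Sketch of Gibbs sampling for a K-component mixture of linear
    regressions y_i = beta_{z_i,0} + beta_{z_i,1} x_i + noise, with
    latent allocations z_i. Fixed noise variance for brevity."""
    rng = np.random.default_rng(seed)
    n = len(x)
    X = np.column_stack([np.ones(n), x])   # design matrix: intercept + slope
    beta = rng.normal(size=(K, 2))         # regression coefficients per component
    sigma2 = 1.0                           # noise variance (held fixed here)
    tau2 = 100.0                           # prior variance on coefficients
    pi = np.full(K, 1.0 / K)               # mixing weights
    z = rng.integers(K, size=n)            # latent component allocations
    for _ in range(iters):
        # 1. Sample allocations z_i given current parameters.
        resid = y[:, None] - X @ beta.T                    # n x K residuals
        logp = np.log(pi) - 0.5 * resid**2 / sigma2
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        z = (p.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)
        # 2. Sample beta_k given allocations (conjugate normal update).
        for k in range(K):
            idx = z == k
            Xk, yk = X[idx], y[idx]
            prec = Xk.T @ Xk / sigma2 + np.eye(2) / tau2
            cov = np.linalg.inv(prec)
            mean = cov @ (Xk.T @ yk) / sigma2
            beta[k] = rng.multivariate_normal(mean, cov)
        # 3. Sample mixing weights from their Dirichlet posterior.
        counts = np.bincount(z, minlength=K)
        pi = rng.dirichlet(1.0 + counts)
    return beta, pi, z
```

Because the sampler conditions on one block of parameters at a time, it can mix poorly when components are weakly separated or when the chain must swap component labels, which is the failure mode the paper's alternative algorithms address.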

References

  • Barron A., Schervish M., and Wasserman L. 1999. The consistency of posterior distributions in nonparametric problems. Annals of Statistics 27: 536-561.

  • Billingsley P. 1986. Probability and Measure, 2nd Edn. John Wiley and Sons.

  • Celeux G., Hurn M., and Robert C. 2000. Computational and inferential difficulties with mixture posterior distributions. Journal of the American Statistical Association 95: 957-970.

  • Cohen E. 1980. Inharmonic tone perception. Ph.D. dissertation, Stanford University.

  • Dacunha-Castelle D. and Gassiat E. 1997. Testing in locally conic models, and application to mixture models. ESAIM Probability and Statistics 1: 285-317.

  • Davenport J., Bezdek J., and Hathaway R. 1988. Parameter estimation for finite mixture distributions. Computers and Mathematics with Applications 15: 819-828.

  • Diebolt J. and Robert C. 1994. Estimation of finite mixture distributions through Bayesian sampling. Journal of the Royal Statistical Society B 56: 363-375.

  • DeVeaux R. 1989. Mixtures of linear regressions. Computational Statistics and Data Analysis 8: 227-245.

  • Feng Z. and McCulloch C. 1994. On the likelihood ratio test statistic for the number of components in a normal mixture with unequal variances. Biometrics 50: 1158-1169.

  • Gelman A., Carlin J., Stern H., and Rubin D. 1995. Bayesian Data Analysis. Chapman and Hall.

  • Genovese C. and Wasserman L. 2000. Rates of convergence for the Gaussian mixture sieve. Annals of Statistics 28: 1105-1127.

  • Hawkins D., Bradu D., and Kass G. 1984. Location of several outliers in multiple regression using elemental sets. Technometrics 26: 197-208.

  • Hurn M., Justel A., and Robert C. 2000. Estimating mixtures of regressions. Technical Report, Department of Mathematics, University of Bath.

  • Jiang W. and Tanner M. 1999. Hierarchical mixtures of experts for exponential family regression models: Approximation and maximum likelihood estimation. Annals of Statistics 27: 987-1011.

  • Jordan M. and Jacobs R. 1994. Hierarchical mixtures of experts and the EM algorithm. Neural Computation 6: 181-214.

  • Justel A. and Pena D. 1996. Gibbs sampling will fail in outlier problems with strong masking. Journal of Computational and Graphical Statistics 5: 176-189.

  • Kass R. and Raftery A. 1995. Bayes factors. Journal of the American Statistical Association 90: 773-795.

  • Keribin C. 1998. Consistent estimation of the order of mixture models. Technical Report. Laboratoire Analyse et Probabilité, Université d'Évry-Val d'Essonne.

  • Mengersen K. and Robert C. 1996. Testing for mixtures: A Bayesian entropic approach. In: Bernardo J., Berger J., Dawid P., and Smith A. (Eds.), Bayesian Statistics 5. Oxford University Press, pp. 255-276.

  • Muller P., Erkanli A., and West M. 1996. Bayesian curve fitting using multivariate normal mixtures. Biometrika 83: 67-79.

  • Peng F., Jacobs R., and Tanner M. 1996. Bayesian inference in mixtures of experts and hierarchical mixtures of experts models with an application to speech recognition. Journal of the American Statistical Association 91: 953-960.

  • Richardson S. and Green P. 1997. On Bayesian analysis of mixtures with an unknown number of components. Journal of the Royal Statistical Society B 59: 731-792.

  • Robert C. 1996. Mixtures of distributions: Inference and estimation. In: Gilks W., Richardson S., and Spiegelhalter D. (Eds.), Markov Chain Monte Carlo in Practice. Chapman and Hall, Ch. 24.

  • Roeder K. and Wasserman L. 1997. Practical Bayesian density estimation using mixtures of normals. Journal of the American Statistical Association 92: 894-902.

  • Rousseeuw P. 1984. Least median of squares regression. Journal of the American Statistical Association 79: 871-880.

  • Rousseeuw P. and van Zomeren B. 1990. Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association 85: 633-651.

  • Stephens M. 2000. Bayesian analysis of mixture models with an unknown number of components—An alternative to reversible jump methods. Annals of Statistics 28: 40-74.

  • Tierney L. 1994. Markov Chains for exploring posterior distributions (with discussion). Annals of Statistics 22: 1701-1762.

  • Verdinelli I. and Wasserman L. 1991. Bayesian analysis of outlier problems using the Gibbs sampler. Statistics and Computing 1: 105-117.

  • Wasserman L. 2000. Asymptotic inference for mixture models using data dependent priors. Journal of the Royal Statistical Society B 62: 159-180.

  • Waterhouse S., MacKay D., and Robinson T. 1996. Bayesian methods for mixtures of experts. In: Touretzky D., Mozer M., and Hasselmo M. (Eds.), Advances in Neural Information Processing Systems 8. MIT Press, Cambridge.

  • Weisberg S. 1985. Applied Linear Regression. John Wiley and Sons.

Cite this article

Viele, K., Tong, B. Modeling with Mixtures of Linear Regressions. Statistics and Computing 12, 315–330 (2002). https://doi.org/10.1023/A:1020779827503
