A Procedure for Estimating the Number of Clusters in Logistic Regression Clustering

Qian, Guoqi; Wu, Yuehua; Shao, Qing

doi:10.1007/s00357-009-9035-y

Published: 04 July 2009

Volume 26, pages 183–199, (2009)

Journal of Classification

Guoqi Qian¹,
Yuehua Wu² &
Qing Shao³

Abstract

This paper studies the problem of estimating the number of clusters in the context of logistic regression clustering. The classification likelihood approach is employed to tackle this problem. A model-selection-based criterion for selecting the number of logistic curves is proposed, and its asymptotic property is considered. The small-sample performance of the proposed criterion is studied by Monte Carlo simulation. In addition, a real data example is presented.
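
The abstract does not state the criterion itself, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes binomial responses, fits one logistic curve per cluster by alternating hard assignment and refitting (a classification-likelihood scheme), and scores each candidate number of clusters with a BIC-style penalty k·p·log(n), which is a stand-in chosen for illustration. All function names (fit_logistic, cluster_logistic, select_k) and the synthetic data are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, m, n_iter=25, ridge=1e-3):
    """Ridge-stabilised Newton-Raphson for binomial logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)
        mu = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (y - m * mu) - ridge * beta
        hess = (X * (m * mu * (1 - mu))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def loglik(X, y, m, beta):
    """Per-observation binomial log-likelihood, up to an additive constant."""
    eta = X @ beta
    return y * eta - m * np.logaddexp(0.0, eta)

def cluster_logistic(X, y, m, k, rng, n_iter=50):
    """Alternate per-cluster refitting and hard reassignment from a random start."""
    n, p = X.shape
    labels = rng.integers(k, size=n)
    betas = np.zeros((k, p))
    for _ in range(n_iter):
        for j in range(k):
            idx = labels == j
            if idx.sum() >= p:  # keep the old curve if a cluster is too small
                betas[j] = fit_logistic(X[idx], y[idx], m[idx])
        ll = np.stack([loglik(X, y, m, b) for b in betas], axis=1)
        new_labels = ll.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, betas, ll[np.arange(n), labels].sum()

def select_k(X, y, m, k_max=3, seed=0, n_starts=5):
    """Score each k by -2 * classification log-likelihood plus an assumed penalty."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    scores = {}
    for k in range(1, k_max + 1):
        best = max(cluster_logistic(X, y, m, k, rng)[2] for _ in range(n_starts))
        scores[k] = -2.0 * best + k * p * np.log(n)  # BIC-style stand-in penalty
    return min(scores, key=scores.get), scores

# Synthetic data (assumed for illustration): two logistic curves, 20 trials each.
rng = np.random.default_rng(42)
n = 200
x = rng.uniform(-2, 2, size=n)
X = np.column_stack([np.ones(n), x])
true_betas = np.array([[-1.0, 2.0], [1.0, -2.0]])
z = rng.integers(2, size=n)
m = np.full(n, 20)
prob = 1.0 / (1.0 + np.exp(-(X * true_betas[z]).sum(axis=1)))
y = rng.binomial(m, prob)

best_k, scores = select_k(X, y, m)
print(best_k, scores)
```

Because hard assignment lets every observation choose its best-fitting curve, the classification likelihood tends to grow with the number of clusters, so the form and size of the penalty largely determine which number is selected. The penalty used here should not be read as the one proposed in the paper.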

Author information

Authors and Affiliations

Department of Mathematics and Statistics, The University of Melbourne, Melbourne, VIC, 3010, Australia
Guoqi Qian
Department of Mathematics and Statistics, York University, Toronto, ON, M3J 1P3, Canada
Yuehua Wu
Biostatistics and Statistical Reporting, One Health Plaza, Bldg. 435–4173, Novartis Pharmaceuticals Corporation, East Hanover, NJ, 07936, USA
Qing Shao

Corresponding author

Correspondence to Guoqi Qian.

Additional information

The authors would like to thank the editor, Prof. Willem J. Heiser, and the anonymous referees for the valuable comments and suggestions, which have led to the improvement of this paper.

About this article

Cite this article

Qian, G., Wu, Y. & Shao, Q. A Procedure for Estimating the Number of Clusters in Logistic Regression Clustering. J Classif 26, 183–199 (2009). https://doi.org/10.1007/s00357-009-9035-y

Issue Date: August 2009