Estimating the Number of Clusters Using Multivariate Location Test Statistics

  • Conference paper
Fuzzy Systems and Knowledge Discovery (FSKD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4223)

Abstract

In cluster analysis, we determine the unknown number of clusters using a criterion based on a classical location test statistic, Hotelling’s T². At each clustering level, the theoretical threshold of the statistic is studied in view of its statistical distribution and the multiple comparison problem. To examine the criterion’s performance, extensive experiments are conducted with synthetic data generated from multivariate normal distributions and with a set of real image data.
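
The abstract does not state the exact form of the criterion or its threshold. As a point of reference only, the following is a minimal sketch of the standard two-sample Hotelling’s T² statistic with pooled covariance, together with its usual F transformation, F = (n1 + n2 - p - 1) / (p (n1 + n2 - 2)) · T², which follows an F(p, n1 + n2 - p - 1) distribution under multivariate normality with equal covariances. The function name `hotelling_t2` and the idea of applying it to two candidate clusters are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 with pooled covariance (illustrative sketch).

    x, y: arrays of shape (n1, p) and (n2, p), one row per observation.
    Returns (t2, f_stat, df1, df2), where f_stat is the F-transformed
    statistic with (df1, df2) degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2 = y.shape[0]

    # Difference of the two sample mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled unbiased covariance estimate (atleast_2d handles p == 1).
    s1 = np.cov(x, rowvar=False)
    s2 = np.cov(y, rowvar=False)
    pooled = np.atleast_2d(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))

    # T^2 = (n1 n2 / (n1 + n2)) * diff' S^{-1} diff, using solve, not inv.
    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(pooled, diff))

    df1, df2 = p, n1 + n2 - p - 1
    f_stat = df2 / (df1 * (n1 + n2 - 2)) * t2
    return float(t2), float(f_stat), df1, df2


if __name__ == "__main__":
    # Hypothetical data: two 2-D Gaussian groups with shifted means.
    rng = np.random.default_rng(0)
    a = rng.normal(loc=0.0, size=(50, 2))
    b = rng.normal(loc=3.0, size=(50, 2))
    print(hotelling_t2(a, b))
```

A large `f_stat` relative to an F critical value (obtainable, for example, from `scipy.stats.f.ppf`) would suggest that the two groups have different locations and should stay separate. Because such a test would be repeated at every clustering level, the significance level needs adjusting for multiple comparisons; which adjustment the paper uses is not given in the abstract.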

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Choi, K., Kim, DH., Choi, T. (2006). Estimating the Number of Clusters Using Multivariate Location Test Statistics. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_43

  • DOI: https://doi.org/10.1007/11881599_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45916-3

  • Online ISBN: 978-3-540-45917-0

  • eBook Packages: Computer Science, Computer Science (R0)
