Determining the Number of Probability-Based Clustering: A Hybrid Approach

  • Conference paper
Content Computing (AWCC 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3309)

Abstract

After analyzing previous methods for determining the number of clusters in probability-based clustering, this paper introduces an improved Monte Carlo Cross-Validation algorithm (iMCCV) that attempts to solve the posterior-probability spread problem, which the original Monte Carlo Cross-Validation algorithm cannot resolve. We further present a hybrid approach that determines the number of clusters by combining the iMCCV algorithm with the parallel coordinates visualization technique. The efficiency of the approach is discussed with experimental results.
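
For orientation, the baseline Monte Carlo Cross-Validation idea that the abstract builds on can be sketched as follows: repeatedly split the data at random into training and test parts, fit a mixture model with each candidate number of components on the training part, and keep the number of components with the highest average held-out log-likelihood. This is only a minimal sketch of plain MCCV on one-dimensional Gaussian mixtures. It is not the paper's iMCCV algorithm, and every name in it (`fit_gmm_1d`, `mccv_select_k`, `test_fraction`, and so on) is a hypothetical choice made for illustration.

```python
import math
import random


def log_gauss(x, mu, var):
    """Log density of a 1-D Gaussian with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def logsumexp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def fit_gmm_1d(data, k, rng, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative)."""
    n = len(data)
    mean = sum(data) / n
    var0 = max(sum((x - mean) ** 2 for x in data) / n, 1e-3)
    mus = rng.sample(data, k)          # random data points as initial means
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior membership probabilities for every point
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, mus[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-3)  # floor avoids collapsed components
    return weights, mus, variances


def mean_loglik(data, params):
    """Average log-likelihood per point under a fitted mixture."""
    weights, mus, variances = params
    total = 0.0
    for x in data:
        total += logsumexp([math.log(w) + log_gauss(x, m, v)
                            for w, m, v in zip(weights, mus, variances)])
    return total / len(data)


def mccv_select_k(data, k_values, n_splits=20, test_fraction=0.5, seed=0):
    """Pick the k with the best average held-out log-likelihood (plain MCCV)."""
    rng = random.Random(seed)
    scores = {k: 0.0 for k in k_values}
    n_test = max(1, int(len(data) * test_fraction))
    for _ in range(n_splits):
        shuffled = list(data)
        rng.shuffle(shuffled)           # a fresh random train/test partition
        test, train = shuffled[:n_test], shuffled[n_test:]
        for k in k_values:
            scores[k] += mean_loglik(test, fit_gmm_1d(train, k, rng)) / n_splits
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `mccv_select_k(data, [1, 2, 3])` on a list of floats returns the selected number of components together with the per-candidate scores. The split count and the 50% test fraction are arbitrary defaults here, not values taken from the paper.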


Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dai, T., Li, C., Sun, J. (2004). Determining the Number of Probability-Based Clustering: A Hybrid Approach. In: Chi, CH., Lam, KY. (eds) Content Computing. AWCC 2004. Lecture Notes in Computer Science, vol 3309. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30483-8_51

  • DOI: https://doi.org/10.1007/978-3-540-30483-8_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23898-0

  • Online ISBN: 978-3-540-30483-8

  • eBook Packages: Springer Book Archive
