A New Clustering Algorithm Based on Probability

  • Conference paper
Intelligent Data analysis and its Applications, Volume II

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 298)

Abstract

Clustering is an active research topic in data mining. After studying the existing classical clustering algorithms, this paper proposes a new probability-based clustering algorithm and gives new definitions of clustering and outliers. The algorithm determines the initial cluster centers automatically from the distribution characteristics of the sample data, and it eliminates outliers during the clustering process. Experimental results on the Iris data set show that the algorithm clusters effectively.

Author information

Corresponding author

Correspondence to Zhang Yue.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Yue, Z., Chuansheng, Z. (2014). A New Clustering Algorithm Based on Probability. In: Pan, JS., Snasel, V., Corchado, E., Abraham, A., Wang, SL. (eds) Intelligent Data analysis and its Applications, Volume II. Advances in Intelligent Systems and Computing, vol 298. Springer, Cham. https://doi.org/10.1007/978-3-319-07773-4_12

  • DOI: https://doi.org/10.1007/978-3-319-07773-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07772-7

  • Online ISBN: 978-3-319-07773-4

  • eBook Packages: Engineering, Engineering (R0)
