A Self-learning Clustering Algorithm Based on Clustering Coefficient

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8786)

Abstract

This paper presents a novel clustering algorithm based on the clustering coefficient. It has two steps. First, the k-nearest-neighbor method and correlation convergence are used for a preliminary clustering. Then the resulting clusters are further split and merged according to intra-class and inter-class concentration degrees derived from the clustering coefficient. The proposed method takes the correlations among all elements within a cluster into account, thereby addressing a weakness of previous methods, which consider only the correlation with the center or core data element. Experiments show that our algorithm performs better at clustering compact data elements as well as at forming irregularly shaped clusters. It is better suited to applications with little prior knowledge, e.g., hotspot discovery.
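
For readers unfamiliar with the quantity the abstract builds on, the following is a minimal sketch of the standard local clustering coefficient computed on a k-nearest-neighbor graph. It is not the authors' algorithm: the function names `knn_graph` and `local_clustering_coefficient`, the use of Euclidean distance, and the choice to link two points when either is among the other's k nearest neighbors are all assumptions made for illustration.

```python
import math
from itertools import combinations


def knn_graph(points, k):
    """Build an undirected graph over point indices.

    Assumption: i and j are linked if either one is among the
    other's k nearest neighbors under Euclidean distance.
    """
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Sort all other points by distance to point i.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in order[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def local_clustering_coefficient(adj, i):
    """Fraction of pairs of i's neighbors that are themselves linked."""
    neighbors = adj[i]
    degree = len(neighbors)
    if degree < 2:
        return 0.0  # no neighbor pairs exist, so the ratio is defined as 0 here
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (degree * (degree - 1))


# Two well-separated groups of three points each (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
adj = knn_graph(points, k=2)
coefficients = [local_clustering_coefficient(adj, i) for i in range(len(points))]
```

A coefficient near 1 means a point's neighbors are densely linked to one another, which is one way to read the abstract's point about correlations among all elements of a cluster rather than only with a center element.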

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhong, M., Ding, Z., Sun, H., Wang, P. (2014). A Self-learning Clustering Algorithm Based on Clustering Coefficient. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8786. Springer, Cham. https://doi.org/10.1007/978-3-319-11749-2_6

  • DOI: https://doi.org/10.1007/978-3-319-11749-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11748-5

  • Online ISBN: 978-3-319-11749-2

  • eBook Packages: Computer Science, Computer Science (R0)