A Self-learning Clustering Algorithm Based on Clustering Coefficient

Zhong, MingJie; Ding, ZhiJun; Sun, HaiChun; Wang, PengWei

doi:10.1007/978-3-319-11749-2_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8786)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

This paper presents a novel clustering algorithm based on the clustering coefficient. It consists of two steps. First, the k-nearest-neighbor method and correlation convergence are employed for a preliminary clustering. Then, the results are further split and merged according to intra-class and inter-class concentration degrees based on the clustering coefficient. The proposed method takes into account the correlation between all pairs of elements in a cluster, thereby addressing a weakness of previous methods, which consider only the correlation with a center or core data element. Experiments show that the algorithm performs better at clustering compact data elements and at forming some irregularly shaped clusters. It is more suitable for applications with little prior knowledge, e.g. hotspot discovery.
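
The two ingredients named in the abstract, a k-nearest-neighbor graph and a clustering coefficient, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the full text is not available here, so the sketch assumes Euclidean distance, a symmetrized k-NN graph, and the standard local clustering coefficient from network analysis (the fraction of a node's neighbor pairs that are themselves linked). The function names `knn_graph` and `local_clustering_coefficient` are hypothetical, and the correlation-convergence and split/merge steps are not reproduced.

```python
from itertools import combinations
from math import dist


def knn_graph(points, k):
    """Build an undirected k-NN graph as {index: set of neighbor indices}.

    Assumption: Euclidean distance; the paper may use another similarity.
    """
    adj = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        # Rank all other points by distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            # Symmetrize: an edge exists if either endpoint selected the other.
            adj[i].add(j)
            adj[j].add(i)
    return adj


def local_clustering_coefficient(adj, i):
    """Fraction of pairs of i's neighbors that are directly connected."""
    nbrs = adj[i]
    if len(nbrs) < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return linked / possible


# Two well-separated, compact groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = knn_graph(points, k=2)
coefficients = [local_clustering_coefficient(graph, i) for i in graph]
```

In this toy input each group forms a closed triangle in the k-NN graph, so every node's neighbors are all linked to one another. A coefficient of this kind measures how strongly the members of a group are connected to each other, rather than only to a single center, which is the property the abstract emphasizes.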

Author information

Authors and Affiliations

The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, 200092, China
MingJie Zhong, ZhiJun Ding & HaiChun Sun
Department of Computer Science, University of Pisa, Pisa, 56127, Italy
PengWei Wang

Editor information

Editors and Affiliations

University of New South Wales, Sydney, Australia
Boualem Benatallah
Boston University, Boston, MA, USA
Azer Bestavros
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos & Athena Vakali
Victoria University, Footscray, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Zhong, M., Ding, Z., Sun, H., Wang, P. (2014). A Self-learning Clustering Algorithm Based on Clustering Coefficient. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8786. Springer, Cham. https://doi.org/10.1007/978-3-319-11749-2_6

DOI: https://doi.org/10.1007/978-3-319-11749-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11748-5
Online ISBN: 978-3-319-11749-2
eBook Packages: Computer Science, Computer Science (R0)
