Hierarchical K-Means Clustering Algorithm Based on Silhouette and Entropy

Dong, Wuzhou; Ren, JiaDong; Zhang, Dongmei

doi:10.1007/978-3-642-23881-9_45

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7002)

Abstract

Hierarchical K-means clustering is an important clustering task in data mining. Existing hierarchical K-means (HK) algorithms have high time complexity, and most are sensitive to noise. To address both problems, a hierarchical K-means clustering algorithm based on silhouette and entropy (HKSE) is proposed. In HKSE, the optimal number of clusters is obtained by calculating an improved silhouette of the dataset to be clustered, which reduces the time complexity from O(n²) to O(k × n). In the hierarchical clustering phase, entropy is used as the similarity measure in place of distance calculation, which reduces the effect of outliers on cluster quality. In the post-processing phase, the outlier cluster is identified by computing the weighted distance between clusters. Experimental results show that HKSE is efficient in reducing both time complexity and sensitivity to noise.
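
The abstract does not define the improved silhouette, so the following is only a rough sketch of the general idea of choosing the cluster number by a silhouette score at O(k × n) cost per evaluation. It assumes the well-known centroid-based "simplified silhouette", which compares each point with cluster centroids rather than with every other point; this may differ from the measure HKSE actually uses. The helper names (`kmeans`, `simplified_silhouette`, `choose_k`), the farthest-point initialisation, and the toy data are all illustrative assumptions, not taken from the paper.

```python
import math

def kmeans(points, k, iters=50):
    """Plain K-means with deterministic farthest-point initialisation
    (an assumption made here only for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids are too
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

def simplified_silhouette(points, labels, centroids):
    """Mean of (b - a) / max(a, b), where a is the distance to the point's
    own centroid and b the distance to the nearest other centroid.
    This needs k distance evaluations per point, i.e. O(k * n) overall."""
    total = 0.0
    for p, l in zip(points, labels):
        a = math.dist(p, centroids[l])
        b = min(math.dist(p, c) for j, c in enumerate(centroids) if j != l)
        total += 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_max):
    """Run K-means for k = 2..k_max and keep the k with the highest score."""
    scores = {}
    for k in range(2, k_max + 1):
        labels, centroids = kmeans(points, k)
        scores[k] = simplified_silhouette(points, labels, centroids)
    return max(scores, key=scores.get), scores

# Toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
best_k, scores = choose_k(points, k_max=5)
print(best_k, scores)
```

On this toy data the score peaks at k = 3, matching the three groups. The standard silhouette would instead average distances from each point to all other points, which is where the O(n²) cost mentioned in the abstract comes from.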

Author information

Authors and Affiliations

College of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Wuzhou Dong & JiaDong Ren
Qinhuangdao Port CO., LTD, Qinhuangdao, China
Dongmei Zhang

Editor information

Editors and Affiliations

School of Business Information Technology, RMIT University, City Campus, 124 La Trobe Street, 3000, Melbourne, VIC, Australia
Hepu Deng
School of Electronics and Information, Tongji University, 201804, Shanghai, China
Duoqian Miao
School of Computer and Information Engineering, Shanghai University of Electric Power, 200090, Shanghai, China
Jingsheng Lei
Department of Business Administration, Caritas Institute of Higher Education, 18 Chui Ling Road, Tseung Kwan O, Hong Kong, China
Fu Lee Wang

About this paper

Cite this paper

Dong, W., Ren, J., Zhang, D. (2011). Hierarchical K-Means Clustering Algorithm Based on Silhouette and Entropy. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science, vol 7002. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23881-9_45

DOI: https://doi.org/10.1007/978-3-642-23881-9_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23880-2
Online ISBN: 978-3-642-23881-9
eBook Packages: Computer Science, Computer Science (R0)