research-article

K-means Optimization Method Based On Adaptive Parallel Hierarchical Clustering

Author:
Xinchen Ma

North China University of Technology, 100144, China

0009-0002-9380-5598

FAIML '23: Proceedings of the 2023 International Conference on Frontiers of Artificial Intelligence and Machine Learning, April 2023, Pages 85–91
https://doi.org/10.1145/3616901.3616922

Published: 05 March 2024

ABSTRACT

K-means depends on two key choices, the number of clusters and the initial cluster centers. Both strongly affect its classification accuracy and efficiency, and both need further optimization. To select the number of clusters, a K-means optimization method based on adaptive parallel hierarchical clustering is proposed. During the merging process of hierarchical clustering, the optimal number of clusters is selected adaptively using an improved clustering evaluation function, and parallel computation replaces serial computation to increase speed. To select cluster centers more accurately, an optimized data density model is proposed that makes full use of potentially related information between samples, which improves the classification accuracy of the algorithm. More importantly, it overcomes the strong subjectivity of hyperparameter selection. The improved algorithm was evaluated in ablation experiments and compared with other traditional algorithms on the Iris and seed data sets. The results show that the optimized algorithm speeds up computation and improves classification accuracy.
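
The general pipeline described in the abstract can be illustrated with a minimal, serial sketch. This is not the authors' implementation: the abstract does not give the improved evaluation function or the density model, so the sketch assumes the plain silhouette coefficient as the evaluation function, centroid linkage for merging, and "smallest total distance to cluster members" as a stand-in for density. The names `choose_k`, `densest_point`, and `grid_blob` are hypothetical, and the parallel step is omitted (the pairwise centroid search is the part that would most naturally be parallelized).

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def silhouette(clusters):
    """Mean silhouette coefficient (assumed stand-in for the paper's evaluation function)."""
    total, count = 0.0, 0
    for i, cluster in enumerate(clusters):
        for p in cluster:
            count += 1
            if len(cluster) == 1:
                continue  # a singleton contributes 0 by convention
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for j, other in enumerate(clusters) if j != i)
            if max(a, b) > 0:
                total += (b - a) / max(a, b)
    return total / count

def choose_k(points, k_max=6):
    """Merge bottom-up and keep the partition (k in 2..k_max) with the best score."""
    clusters = [[p] for p in points]
    best_score, best_clusters = -1.0, None
    while len(clusters) > 2:
        cents = [centroid(c) for c in clusters]
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        # Merge the two clusters whose centroids are closest (centroid linkage).
        i, j = min(pairs, key=lambda ij: math.dist(cents[ij[0]], cents[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        if len(clusters) <= k_max:
            score = silhouette(clusters)
            if score > best_score:
                best_score, best_clusters = score, [list(c) for c in clusters]
    return best_clusters

def densest_point(cluster):
    """Member with the smallest total distance to the rest (assumed density proxy)."""
    return min(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))

def kmeans(points, centers, iters=100):
    """Standard Lloyd iterations starting from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[idx].append(p)
        new_centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

def grid_blob(cx, cy, step=0.5):
    """Toy data: a 3x3 grid of points around (cx, cy)."""
    return [(cx + dx * step, cy + dy * step)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

points = grid_blob(0.0, 0.0) + grid_blob(10.0, 0.0) + grid_blob(0.0, 10.0)
clusters = choose_k(points)                       # number of clusters chosen adaptively
centers, groups = kmeans(points, [densest_point(c) for c in clusters])
print(len(centers), sorted(centers))
```

On this toy data the three well-separated groups are recovered without specifying the number of clusters in advance. Real data sets such as Iris would need a faster merge step, which is where a parallel implementation would matter.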

Index Terms

K-means Optimization Method Based On Adaptive Parallel Hierarchical Clustering
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis
  2. Parallel computing methodologies
2. Theory of computation
  1. Design and analysis of algorithms

Published in

FAIML '23: Proceedings of the 2023 International Conference on Frontiers of Artificial Intelligence and Machine Learning
April 2023
296 pages
ISBN:9798400707544
DOI:10.1145/3616901

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Adaption
Data density
Hierarchical clustering
K-means
Parallel computing
Qualifiers
- research-article
- Research
- Refereed limited