The Systematic Review of K-Means Clustering Algorithm

Authors:
Ardavan Ashabi, University of Technology Malaysia, Malaysia
Shamsul Bin Sahibuddin, Razak Faculty of Technology and Informatics, University of Technology Malaysia, Kuala Lumpur, Malaysia
Mehdi Salkhordeh Haghighi, Faculty of Computer Engineering, Sadjad University of Technology, Mashhad, Iran

ICNCC '20: Proceedings of the 2020 9th International Conference on Networks, Communication and Computing, December 2020, Pages 13–18
https://doi.org/10.1145/3447654.3447657

Published: 13 May 2021

ABSTRACT

The world is now generating huge amounts of data across many domains. Data mining and data analytics are the practices used to analyze data and extract hidden knowledge. One of the major data mining methods used to analyze data is data clustering, which eases the extraction of information from each cluster separately. Many algorithms are used to perform clustering. One of the most famous, in use for more than half a century, is k-means. After many optimizations and enhancements, k-means is still considered the most popular clustering algorithm and is still used in various domains. This research conducts a Systematic Literature Review (SLR) to collect, classify, and analyze the primary studies on the different versions of the k-means clustering algorithm. The SLR provides a means of finding, appraising, and interpreting existing research pertinent to the topic. By narrowing down the crucial areas of debate, we hope to establish a foundation for future research.


Published in

ICNCC '20: Proceedings of the 2020 9th International Conference on Networks, Communication and Computing
December 2020
157 pages
ISBN:9781450388566
DOI:10.1145/3447654

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Data Clustering
Data Mining
K-means
SLR
Systematic literature Review
Unsupervised Learning
Qualifiers
- research-article
- Research
- Refereed limited