Abstract
Cluster analysis is widely used in pattern recognition, image segmentation, document clustering, intrusion detection, market research, and other fields. Many clustering algorithms have been proposed, but most of them are unsuitable for complex patterns with large variations in density and manifold structure, and they are not robust to noise. To overcome these deficiencies, we propose a novel clustering algorithm, called DCLGMS, based on the gravity-mass-square ratio (LGMS) and the density core (DCore), with a dynamic denoising radius. The algorithm obtains the correct clustering result on arbitrarily shaped datasets, excluding noise, without parameter settings. Within the algorithm, DCore preserves the shapes of clusters, and LGMS helps to extract local core points. In addition, the dynamic denoising radius helps to detect noise points, whatever their cause, and to avoid their interference at the beginning of the algorithm. Experiments on both synthetic and real datasets show that the algorithm performs excellently.
Acknowledgements
The authors would like to thank the editor and anonymous reviewers for their valuable comments and suggestions. This work is funded by the National Natural Science Foundation of China (no. 61701051), the Fundamental Research Funds for the Central Universities (no. 2019CDCGJSJ329), and the Graduate Scientific Research and Innovation Foundation of Chongqing, China (grant no. CYS20067).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Zhang, YF., Wang, YQ., Li, GG. et al. A novel clustering algorithm based on the gravity-mass-square ratio and density core with a dynamic denoising radius. Appl Intell 52, 8924–8946 (2022). https://doi.org/10.1007/s10489-021-02753-0
DOI: https://doi.org/10.1007/s10489-021-02753-0