Abstract
To deal with the complex structure of the data set, density peaks clustering algorithm (DPC) was proposed in 2014. The density and the delta-distance are utilized to find the clustering centers in the DPC method. It detects outliers efficiently and finds clusters of arbitrary shape. But unfortunately, we need to calculate the distance between all data points in the first process, which limits the running speed of DPC algorithm on large datasets. To address this issue, this paper introduces a novel approach based on grid, called density peaks clustering algorithm based on grid (DPCG). This approach can overcome the operation efficiency problem. When calculating the local density, the idea of the grid is introduced to reduce the computation time based on the DPC algorithm. Neither it requires calculating all the distances nor much input parameters. Moreover, DPCG algorithm successfully inherits the all merits of the DPC algorithm. Experimental results on UCI data sets and artificial data show that the DPCG algorithm is flexible and effective.












Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Morris K, Mcnicholas PD (2016) Clustering, classification, discriminant analysis, and dimension reduction via generalized hyperbolic mixtures. Comput Stat Data Anal 97:133–150
Cai R, Zhang Z, Tung AKH et al (2014) A general framework of hierarchical clustering and its applications. Inf Sci 272(C):29–48
Yu Z, Luo P, You J et al (2015) Incremental semi-supervised clustering ensemble for high dimensional data clustering. IEEE Trans Knowl Data Eng 28(3):701–714
Xia Z, Wang X, Zhang L et al (2016) A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing. IEEE Transactions on Information Forensics and Security 1–1
Yu Z, Li L, Liu J et al (2015) Adaptive noise immune cluster ensemble using affinity propagation. IEEE Trans Knowl Data Eng 27(12):3176–3189
Yu Z, Chen H, You J et al (2015) Adaptive fuzzy consensus clustering framework for clustering analysis of cancer data. IEEE/ACM Trans Comput Biol Bioinf 12(4):887–901
Ren Y, Shen J, Wang J et al (2015) Mutual verifiable provable data auditing in public cloud storage. J Internet Technol 16(2):317–323
Bayat A (2013) Uniform voltage distribution based constructive algorithm for optimal reconfiguration of electric distribution networks. Electr Power Syst Res 104(9):146–155
Yu Z, Zhu X, Wong HS et al (2016) Distribution-Based Cluster Structure Selection. IEEE Transactions on Cybernetics
Zhang Z, Ding F, Liu X (2011) Hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average systems. Comput Math Appl 61(3):672–682
Fraley C, Raftery AE (2011) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611–631
He Y, Tan H, Luo W, et al (2011) MR-DBSCAN: an efficient parallel density-based clustering algorithm using mapreduce. International Conference on Parallel and Distributed Systems Proceedings, pp 473–480
Li C, Li L, Zhang J et al (2012) Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi. J Comput Chem 33(24):1960–1966
Ros F, Guillaume S (2016) DENDIS: a new density-based sampling for clustering algorithm. Expert Syst Appl 56:349–359
Nanda SJ, Panda G (2015) Design of computationally efficient density-based clustering algorithms. Data Knowl Eng 95:23–38
Yue SH, Wang JS, Tao G et al (2010) An unsupervised grid-based approach for clustering analysis. Sciece China Inf Sci 53(7):1345–1357
Huang J, Zhang X (2013) An incremental grid clustering algorithm based on density-dimension-tree. International Conference on Machine Learning and Cybernetics. IEEE, pp 356–361
Zhang D, Tian H, Ying P et al (2012) A clustering algorithm based on density-grid for stream data. International Conference on Parallel and Distributed Computing, Applications, and Technologies (PDCAT), 398–403
Zhong Z (2011) A kind of data stream clustering algorithm based on grid-density. Commun Comput Inf Sci 215:418–423
Rodrigitez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344(6191):1492–1496
Zheng Y, Byeungwoo J, Xu D et al (2015) Image segmentation by generalized hierarchical fuzzy C-means algorithm. J Intell Fuzzy Syst 28(2):4024–4028
Arora P, Deepali, Varshney S (2016) Analysis of K-Means and K-Medoids algorithm for big data. Procedia Comput Sci 78:507–512
Jia H, Ding S, Du M (2015) Self-tuning P-spectral clustering based on shared nearest neighbors. Cogn Comput 7(5):622–632
Yu Z, Li L, You J et al (2012) SC3: triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Trans Comput Biol Bioinf 9(6):1751–1765
Cheng Q, Liu Z, Huang J et al (2016) Community detection in hypernetwork via density-ordered tree partition. Appl Math Comput 276(C):384–393
Yin Y, Wang X, Xu D et al (2016) Robust visual detection-learning-tracking framework for autonomous aerial refueling of UAVs. IEEE Trans Instrum Meas 65(3):1–12
Du M, Ding S, Jia H (2016) Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl-Based Syst 99:135–145
Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402
Guo L, Lin JH, Guo Q et al (2015) Identifying multiple influential spreaders in term of the distance-based coloring. Phys Lett A 380(7–8):837–842
Chen B, Shu H, Coatrieux G et al (2014) Color image analysis by quaternion-type moments. J Math Imaging Vision 51(1):124–144
Kumar KM, Reddy ARM (2016) A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recogn 58:39–48
Kanungo T, Mount DM, Netanyahu NS et al (2002) An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell 24(7):881–892
Xie J, Gao H, Xie W et al (2016) Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K -nearest neighbors. Inf Sci 354:19–40
Wang S, Wang D, Caoyuan LI et al (2016) Clustering by fast search and find of density peaks with data field. Chin J Electron 25(3):397–402
Liang Z, Chen P (2016) Delta-density based clustering with a divide-and-conquer strategy: 3DC clustering. Pattern Recogn Lett 73(C):52–59
Ban Z, Liu J, Yuan L et al (2015) A modified density-based clustering algorithm and its implementation. PATTERN RECOGNITION AND COMPUTER VISION, 9813
Wang D, Zhang B, Wang K (2014) A vertex-clustering algorithm based on the cluster-clique. Algorithms Archit Parallel Process 8631:376–385
Bohn B, Garcke J, Griebel M (2016) A sparse grid based method for generative dimensionality reduction of high-dimensional data. J Comput Phys 309:1–17
Fu Z, Sun X, Liu Q, et al (2015) Achieving efficient cloud search services: multi-keyword ranked search over encrypted cloud data supporting parallel computing. Ieice Trans Commun E98.B(1):190–200
Xia Z, Wang X, Sun X et al (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans Parallel Distrib Syst 27(2):340–352
Veenman CJ, Reinders MJT, Backer E (2002) A maximum variance cluster algorithm. IEEE Trans Pattern Anal Mach Intell 24(9):1273–1280
Fränti P, Virmajoki O (2006) Iterative shrinking method for clustering problems. Pattern Recogn 39(5):761–775
Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):341–352
Fu L, Medico E (2007) FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinf 8(1):1–15
Kärkkäinen I, Fränti P (2002) Dynamic local search algorithm for the clustering problem. Technical Report A-2002-6, Department of Computer Science, University of Joensuu
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Xu, X., Ding, S., Du, M. et al. DPCG: an efficient density peaks clustering algorithm based on grid. Int. J. Mach. Learn. & Cyber. 9, 743–754 (2018). https://doi.org/10.1007/s13042-016-0603-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-016-0603-2