Abstract
Density peak clustering can recognize clusters of arbitrary shape and has therefore attracted attention in the academic community. However, existing density peak clustering algorithms tend to select cluster centers from dense regions and thus easily overlook clusters in sparse regions. To solve this problem, we redefine the local density of a point as the number of points whose neighbors contain this point. This idea is based on the following finding: whether it lies in a dense cluster or in a sparse cluster, a cluster center has a relatively high local density under our new measure. Even in a sparse region, some points may have high local densities under our definition, so one of these points can be selected as the center of the region in subsequent steps, and the region is then detected as a cluster. We apply our new definition both to density peak clustering and to the combination of density peak clustering with agglomerative clustering. Experiments on benchmark datasets show the effectiveness of our methods.
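The redefined local density can be illustrated with a minimal sketch. The abstract does not say how a point's neighbors are chosen, so the sketch assumes they are its k nearest neighbors under Euclidean distance; the function name `reverse_neighbor_density` and the parameter `k` are hypothetical and are not taken from the paper.

```python
import numpy as np

def reverse_neighbor_density(points, k):
    """For each point, count how many other points include it among
    their k nearest neighbors (assumption: Euclidean k-NN neighborhoods)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A point is not counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Row i holds the indices of the k nearest neighbors of point i.
    knn = np.argsort(dist, axis=1)[:, :k]
    # The density of point j is the number of rows in which j appears.
    return np.bincount(knn.ravel(), minlength=n)

# Toy 1-D data: a dense group (spacing about 1) and a sparse group
# (spacing about 10), each with one point lying between the other two.
toy_points = [[0.0], [0.9], [2.0], [100.0], [109.0], [120.0]]
toy_density = reverse_neighbor_density(toy_points, k=1)
print(toy_density.tolist())
```

On this toy data the middle point of the sparse group receives the same count as the middle point of the dense group, even though the spacing differs by roughly a factor of ten. This is the behavior the abstract relies on: the count depends on neighborhood membership rather than on absolute distances.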
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China under Grant No. 61772120.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Guo, Z., Huang, T., Cai, Z., Zhu, W. (2018). A New Local Density for Density Peak Clustering. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_34
DOI: https://doi.org/10.1007/978-3-319-93040-4_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93039-8
Online ISBN: 978-3-319-93040-4
eBook Packages: Computer Science, Computer Science (R0)