Abstract
Clustering by fast search and find of density peaks (CFDP), a recently proposed clustering algorithm, can automatically identify cluster centers without an iterative process. The key step in CFDP is to find, for each point, the nearest neighbor with higher density. However, CFDP may be ineffective when density fluctuates within a cluster or between two nearby clusters. This study presents two improved algorithms, CFDP-ED-TSNN1 and CFDP-ED-TSNN2, which use a dissimilarity measure in different ways. The dissimilarity is based on shared nearest neighbors and transitive closure. Experimental results on several artificial datasets and a real-world dataset show that the improved algorithms are competitive.
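The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cutoff-kernel density, the tie-breaking by index, the shared-neighbor ratio, and every function and parameter name (density_peaks, snn_similarity, max_min_closure, d_c, k) are assumptions made for illustration. The first function performs the CFDP key step, and the other two build a shared-nearest-neighbor similarity and its max-min transitive closure, from which a dissimilarity such as 1 minus the closure value could be derived.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def density_peaks(dist, d_c):
    """CFDP key step: local density and nearest neighbor with higher density.

    rho[i]    : number of other points closer than the cutoff d_c (assumed kernel)
    delta[i]  : distance to the nearest point visited earlier in density order
    parent[i] : index of that point, or -1 for the densest point
    """
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Visit points from densest to sparsest; equal densities are ordered by index
    # (an assumption, so that every point except the first gets a parent).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    parent = [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            # Common convention: the densest point gets its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:pos], key=lambda m: dist[i][m])
        delta[i] = dist[i][j]
        parent[i] = j
    return rho, delta, parent

def snn_similarity(dist, k):
    """Fraction of the k nearest neighbors that two points share (assumed form)."""
    n = len(dist)
    knn = [
        set(sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k])
        for i in range(n)
    ]
    return [
        [1.0 if i == j else len(knn[i] & knn[j]) / k for j in range(n)]
        for i in range(n)
    ]

def max_min_closure(sim):
    """Max-min transitive closure, computed with a Floyd-Warshall-style update."""
    n = len(sim)
    t = [row[:] for row in sim]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                t[i][j] = max(t[i][j], min(t[i][m], t[m][j]))
    return t

# Toy data: two well-separated groups, each a unit square plus its center.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
dist = pairwise_distances(points)
rho, delta, parent = density_peaks(dist, d_c=1.0)
sim = snn_similarity(dist, k=3)
closure = max_min_closure(sim)
# Candidate centers: points that combine high rho with high delta.
centers = sorted(range(len(points)), key=lambda i: -delta[i])[:2]
```

In this toy example the two square centers have the highest density and the largest delta, so they stand out as candidate cluster centers, while the closure assigns zero similarity to any pair of points drawn from different groups. How the improved algorithms actually combine the dissimilarity with distance is specified in the paper itself, not in this sketch.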
Acknowledgements
This work is partly supported by the Anhui Provincial Natural Science Foundation (No. 1408085MKL07).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ni, L., Luo, W., Bu, C., Hu, Y. (2017). Improved CFDP Algorithms Based on Shared Nearest Neighbors and Transitive Closure. In: Kang, U., Lim, EP., Yu, J., Moon, YS. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10526. Springer, Cham. https://doi.org/10.1007/978-3-319-67274-8_8
DOI: https://doi.org/10.1007/978-3-319-67274-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67273-1
Online ISBN: 978-3-319-67274-8
eBook Packages: Computer Science, Computer Science (R0)