Improved CFDP Algorithms Based on Shared Nearest Neighbors and Transitive Closure

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10526)

Abstract

A recently proposed clustering algorithm, Clustering by Fast Search and Find of Density Peaks (CFDP), can identify cluster centers automatically without an iterative process. The key step in CFDP is to find, for each point, its nearest neighbor with higher density. However, CFDP may be ineffective when density fluctuates within a cluster or between two nearby clusters. This study presents two improved algorithms, CFDP-ED-TSNN1 and CFDP-ED-TSNN2, which use a dissimilarity measure in different ways. The dissimilarity is based on shared nearest neighbors and transitive closure. Experimental results on several artificial datasets and one real-world dataset show that the improved algorithms are competitive.
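
The key step named in the abstract, finding each point's nearest neighbor with higher density, can be illustrated with a minimal sketch of the baseline CFDP quantities. This is not the improved CFDP-ED-TSNN1 or CFDP-ED-TSNN2 method. The function name `density_peaks`, the cutoff parameter `d_c`, the use of Euclidean distance, and the index-based tie-breaking for equal densities are all assumptions made for illustration.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta, nearest_higher) for a list of coordinate tuples.

    rho[i]            : number of other points closer than the cutoff d_c
    delta[i]          : distance to the nearest point of higher density
    nearest_higher[i] : index of that point, or -1 for the densest point

    Assumption: ties in density are broken by index, so every point except
    the first in the ordering has at least one "higher" candidate.
    """
    n = len(points)
    # Full pairwise Euclidean distance matrix (O(n^2), fine for a sketch).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density with a hard cutoff.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # By convention the densest point gets its largest distance.
            delta[i] = max(dist[i])
            continue
        # Nearest point among those that precede i in the density ordering.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    rho, delta, parent = density_peaks(pts, d_c=1.5)
    for i, p in enumerate(pts):
        print(p, rho[i], round(delta[i], 3), parent[i])
```

In this sketch, points with both a large `rho` and a large `delta` are the candidate cluster centers, and each remaining point would be assigned to the cluster of its `nearest_higher` point. A density fluctuation inside one cluster can produce a spuriously large `delta`, which is the weakness the abstract describes.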

Acknowledgements

This work is partly supported by the Anhui Provincial Natural Science Foundation (No. 1408085MKL07).

Author information

Corresponding author

Correspondence to Wenjian Luo.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ni, L., Luo, W., Bu, C., Hu, Y. (2017). Improved CFDP Algorithms Based on Shared Nearest Neighbors and Transitive Closure. In: Kang, U., Lim, EP., Yu, J., Moon, YS. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10526. Springer, Cham. https://doi.org/10.1007/978-3-319-67274-8_8

  • DOI: https://doi.org/10.1007/978-3-319-67274-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67273-1

  • Online ISBN: 978-3-319-67274-8

  • eBook Packages: Computer Science, Computer Science (R0)
