Clustering by Searching Density Peaks via Local Standard Deviation

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10585)

Abstract

DPC (clustering by fast search and find of density peaks) cannot find cluster centers that lie in sparse clusters. To solve this problem, this paper proposes a new clustering algorithm, named SD_DPC. It uses the local standard deviation of point i to define its local density \(\rho_i\), so that all cluster centers, whether they come from dense or sparse clusters, are found as density peaks. The power of SD_DPC was tested on several synthetic data sets. Three of them comprise both dense and sparse clusters with various numbers of points; the other is a typical synthetic data set that is often used to test the performance of clustering algorithms. The performance of SD_DPC is compared with that of DPC and of our previous work, KNN-DPC (K-nearest neighbors DPC) and FKNN-DPC (fuzzy weighted K-nearest neighbors DPC). The experimental results demonstrate that SD_DPC is superior to DPC, KNN-DPC, and FKNN-DPC both in finding cluster centers and in clustering the data sets.
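
The abstract does not state the density formula, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the local density \(\rho_i\) is the inverse of a standard-deviation-like spread of the distances from point i to its K nearest neighbors, and then applies the usual DPC steps: the distance \(\delta_i\) to the nearest point of higher density, centers ranked by \(\rho_i \delta_i\), and each remaining point assigned to the cluster of its nearest higher-density neighbor. All function names, the parameter `k`, and the density formula itself are hypothetical.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def local_sd_density(D, k):
    """ASSUMED density: inverse of a local standard deviation over the
    k nearest neighbors (k >= 2). The paper's exact formula may differ."""
    knn = np.sort(D, axis=1)[:, 1:k + 1]           # drop the self-distance 0
    local_sd = np.sqrt((knn ** 2).sum(axis=1) / (k - 1))
    return 1.0 / (local_sd + 1e-12)                # epsilon guards duplicates


def delta_and_parent(D, rho):
    """DPC delta: distance to the nearest point of higher density."""
    n = len(rho)
    order = np.argsort(-rho, kind="stable")        # densest point first
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = D[order[0]].max()            # convention for the top point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]                      # all points ranked denser
        j = higher[np.argmin(D[i, higher])]
        delta[i], parent[i] = D[i, j], j
    return delta, parent, order


def sd_dpc_sketch(X, k, n_clusters):
    D = pairwise_distances(X)
    rho = local_sd_density(D, k)
    delta, parent, order = delta_and_parent(D, rho)
    gamma = rho * delta
    gamma[order[0]] = np.inf                       # densest point is always a center
    centers = np.argsort(-gamma)[:n_clusters]
    labels = np.full(len(X), -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:                                # parents are labeled before children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy data: one dense and one sparse Gaussian cluster, far apart.
rng = np.random.default_rng(0)
dense = rng.normal(loc=0.0, scale=0.3, size=(100, 2))
sparse = rng.normal(loc=20.0, scale=1.0, size=(30, 2))
X = np.vstack([dense, sparse])
labels, centers = sd_dpc_sketch(X, k=5, n_clusters=2)
```

Because the assumed density is computed from each point's own K nearest neighbors rather than from a global cutoff distance, it adapts to the local scale, which is one plausible way for the center of a sparse cluster to still stand out as a local peak.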

Acknowledgments

We are much obliged to those who provide the public data sets for us to use. This work is supported in part by the National Natural Science Foundation of China under Grant No. 61673251, by the Key Science and Technology Program of Shaanxi Province of China under Grant No. 2013K12-03-24, by the Fundamental Research Funds for the Central Universities under Grant No. GK201701006, and by the Innovation Funds of Graduate Programs at Shaanxi Normal University under Grant Nos. 2015CXS028 and 2016CSY009.

Corresponding author

Correspondence to Juanying Xie.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Xie, J., Jiang, W., Ding, L. (2017). Clustering by Searching Density Peaks via Local Standard Deviation. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2017. IDEAL 2017. Lecture Notes in Computer Science, vol 10585. Springer, Cham. https://doi.org/10.1007/978-3-319-68935-7_33

  • DOI: https://doi.org/10.1007/978-3-319-68935-7_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68934-0

  • Online ISBN: 978-3-319-68935-7

  • eBook Packages: Computer Science, Computer Science (R0)
