Clustering by differencing potential of data field

Computing

Abstract

Hierarchical clustering with data field can find clusters of various shapes and filter noise from a data set without input parameters. However, its clustering process is complex, and it cannot deal effectively with complex, high-dimensional data. In this paper, a novel clustering algorithm is proposed that works by differencing the potential (DP) of the data field. The potential difference designates the nearest object with high potential as the aggregation direction, and the data distance is used to divide the global data set into multiple local clusters. At the same time, noise is identified effectively from the potential of the data field. Experimental results on eight popular data sets and a facial image data set indicate that the proposed method outperforms existing clustering algorithms on high-dimensional data sets, on data distributed in complex shapes, and in noise identification.
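
The aggregation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the potential function or any thresholds, so the sketch assumes a Gaussian-like potential with unit masses, phi(x_i) = sum_j exp(-(d_ij / sigma)^2), reads "high potential" as "higher than the object's own potential", and introduces the hypothetical parameters `sigma`, `max_link`, and `noise_phi` purely for illustration.

```python
import numpy as np


def potential(X, sigma):
    """Assumed Gaussian-like data-field potential with unit masses."""
    # Pairwise Euclidean distances, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    phi = np.exp(-(d / sigma) ** 2).sum(axis=1)
    return phi, d


def dp_cluster_sketch(X, sigma=1.0, max_link=2.0, noise_phi=1.5):
    """Link each object to its nearest higher-potential object.

    sigma, max_link and noise_phi are hypothetical parameters; they do
    not come from the paper.
    """
    X = np.asarray(X, dtype=float)
    phi, d = potential(X, sigma)
    n = len(X)

    # Rank objects by descending potential; ties are broken by index.
    order = np.argsort(-phi, kind="stable")
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)

    # Aggregation direction: nearest object that ranks higher.
    parent = np.arange(n)
    for i in range(n):
        higher = np.flatnonzero(rank < rank[i])
        if higher.size:
            j = higher[np.argmin(d[i, higher])]
            # Links longer than max_link are cut, splitting the
            # global data set into local clusters.
            if d[i, j] <= max_link:
                parent[i] = j

    # Propagate labels from roots; parents always precede children
    # in the descending-potential order.
    labels = np.full(n, -1)
    next_label = 0
    for i in order:
        if parent[i] == i:
            labels[i] = next_label
            next_label += 1
        else:
            labels[i] = labels[parent[i]]

    # Objects with low potential are flagged as noise (label -1).
    labels[phi < noise_phi] = -1
    return labels, phi
```

Calling `dp_cluster_sketch(X)` on an `(n, d)` array returns one integer label per object, with `-1` marking objects treated as noise. Because the potential includes each object's own contribution of 1, `noise_phi` must exceed 1 for an isolated object to be flagged.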

Acknowledgements

The work is supported in part by the National Key Research and Development Plan of China (2016YFC0803004), the National Natural Science Fund of China (61472039), and Beijing Major Science and Technology (Z171100005117002).

Author information

Corresponding authors

Correspondence to Hanning Yuan or Jing Geng.

About this article

Cite this article

Wang, S., Wang, S., Yuan, H. et al. Clustering by differencing potential of data field. Computing 100, 403–419 (2018). https://doi.org/10.1007/s00607-018-0605-x

  • DOI: https://doi.org/10.1007/s00607-018-0605-x
