Skip to main content
Log in

A potential-based clustering method with hierarchical optimization

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

This work proposes a novel data clustering algorithm based on the potential field model, with a hierarchical optimization mechanism on the algorithm. There are two stages in this algorithm. Firstly, we build an edge-weighted tree based on the mutual distances between all data points and their hypothetical potential values derived from the data distribution. Using the tree structure, the dataset can be divided into an appropriate number of initial sub-clusters, with the cluster centers close to the local minima of the potential field. Then the sub-clusters are further merged according to the well-designed merging criteria by analyzing their border potential values and the cluster average potential values. The proposed clustering algorithm follows a hierarchical clustering mechanism, and aims to optimize the initial sub-cluster results in the first stage. The algorithm takes advantage of the cluster merging criteria to merge the sub-clusters, so it can automatically stop the clustering process without designating the number of clusters in advance. The experimental results show that the proposed algorithm produces the most satisfactory clustering results in most cases compared with other existing methods, and can effectively identify the data clusters with arbitrary shape, size and density.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7

Similar content being viewed by others

Notes

  1. http://archive.ics.uci.edu/ml

References

  1. Bahrololoum, A., Nezamabadi-pour, H., Saryazdi, S.: A data clustering approach based on universal gravity rule. Eng. Appl. Artif. Intel. 45, 415–428 (2015)

    Article  Google Scholar 

  2. Chang, H., Yeung, D.-Y.: Robust path-based spectral clustering. Pattern Recogn. 41, 191–203 (2008)

    Article  Google Scholar 

  3. Endo, Y., Iwata, H.: Dynamic clustering based on universal gravitation model. In: International Conference on Modeling Decisions for Artificial Intelligence, pp. 183–193 (2005)

    Chapter  Google Scholar 

  4. Ester, M., Kriegel, H., Sander, J., Xiaowei, X.: A density-based algorithm for discovery clusters in large spatial databases with noise. In: International Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)

  5. Filippone, M., Camastra, F., Masulli, F., Rovetta, S.: A survey of kernel and spectral methods for clustering. Pattern Recogn. 41, 176–190 (2008)

    Article  Google Scholar 

  6. Fowlkes, E.B., Mallows, C.L.: A method for comparing two hierarchical clusterings. J. Am. Stat. Assoc. 78, 553–569 (2012)

    Article  Google Scholar 

  7. Gao, J., Zhao, L., Chen, Z., Li, P., Han, X., Hu, Y.: Icfs: An improved fast search and find of density peaks clustering algorithm. In: International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress, pp. 537–543 (2016)

  8. Jain, A.K.: Data clustering: a user’s dilemma. In: International Conference on Pattern Recognition and Machine Intelligence, pp. 1–10 (2005)

    Google Scholar 

  9. Jain, A.K.: Dataclustering: 50 years beyond k-means. Pattern Recogn. Lett. 31, 651–666 (2010)

    Article  Google Scholar 

  10. Kleinberg, J.: An impossibility theorem for clustering. In: Annural Conference on Neural Information Processing Systems, pp. 463–470 (2002)

  11. Liu, X., Liu, Y., Xie, Q.: A potential-based clustering method by fast search and find of cluster centers. In: Proceedings of the Big Data Partitioning and Mining Workshop associated with 2017 IEEE International Conference on Big Knowledge (2017)

  12. Lu Y., Wan, Y.: Clustering by sorting potential values (cspv): a novel potential-based clustering method. Pattern Recogn. 45, 3512–3522 (2012)

    Article  Google Scholar 

  13. Lu, Y., Yi, W.: Pha: a fast potential-based hierarchical agglomerative clustering method. Pattern Recogn. 46, 1227–1239 (2013)

    Article  Google Scholar 

  14. Omran, M.G.H., Engelbrecht, A.P., Salman, A.: An overview of clustering methods. Intelligent Data Analysis 11, 583–605 (2007)

    Article  Google Scholar 

  15. Peng, L., Bo, Y., Chen, Y., Abraham, A.: Data gravitation based classification. Inform. Sci. 179, 809–819 (2009)

    Article  Google Scholar 

  16. Rodriguez, A., Laio, A.: Clustering by fast search and find of density peaks. Science 344, 1492–1496 (2014)

    Article  Google Scholar 

  17. Rostami, A., Lashkari, M.: Extended pso algorithm for improvement problems k-means clustering algorithm. International Journal of Managing Information Technology 6, 17–29 (2014)

    Article  Google Scholar 

  18. Shang, S., Chen, L., Jensen, C.S., Wen, J.-R., Kalnis, P.: Seasearch trajectories by regions of interest. IEEE Trans. Knowl. Data Eng. 29, 1549–1562 (2017)

    Article  Google Scholar 

  19. Shang, S., Guo, D., Liu, J., Zheng, K., Wen, J.: Finding regions of interest using location based social media. Neurocomputing 173, 118–123 (2016)

    Article  Google Scholar 

  20. Shang, S., Liu, J., Zhao, K., Yang, M., Zheng, K., Wen, J.-R.: Dimension reduction with meta object-groups for efficient image retrieval. Neurocomputing 169, 50–54 (2015)

    Article  Google Scholar 

  21. Shang, S., Zheng, K., Jensen, C.S., Yang, B., Kalnis, P., Li, G., Wen, J.: Discovery of path nearby clusters in spatial networks. IEEE Trans. Knowl. Data Eng. 27, 1505–1518 (2015)

    Article  Google Scholar 

  22. Shi, S., Yang, G., Wang, D., Zheng, W.: Potential-based hierarchical clustering. In: International Conference on Pattern Recognition, pp. 272–275 (2002)

  23. Tu, Q., Lu, J., Yuan, B., Tang, J.B., Yang, J.Y.: Density-based hierarchical clustering for streaming data. Pattern Recogn. Lett. 33, 641–645 (2012)

    Article  Google Scholar 

  24. Wright, W.E.: Gravitational clustering. Pattern Recogn. 9, 151–166 (1977)

    Article  Google Scholar 

  25. Xu, R., Wunsch, D.C.: Survey of clustering algorithms. IEEE Trans. Neural Netw. 16, 645–678 (2005)

    Article  Google Scholar 

  26. Xu, X., Ester, M., Kriegel, H., Sander, J.: A distribution-based clustering algorithm for mining in large spatial databases. In: International Conference on Data Engineering, pp. 324–331 (1998)

  27. Yamachi, H., Kambayashi, Y., Tsujimura, Y.: A clustering method based on potential field. In: The 10th Asia Pacific Industrial Engineering & Management System Conference, pp. 846–855 (2009)

  28. Zhao, Q., Shi, Y., Liu, Q., Franti, P.: A grid-growing clustering algorithm for geo-spatial data. Pattern Recogn. Lett. 53, 77–84 (2015)

    Article  Google Scholar 

  29. Zheng, K., Zheng, Y., Yuan, N.J., Shang, S., Zhou, X.: Online discovery of gathering patterns over trajectories. IEEE Trans. Knowl. Data Eng. 26, 1974–1988 (2014)

    Article  Google Scholar 

  30. Zhu, J., Xie, Q., Zheng, K.: An improved early detection method of type-2 diabetes mellitus using multiple classifier system. Inform. Sci. 292, 1–14 (2015)

    Article  Google Scholar 

  31. Zhu, X., Li, X., Zhang, S.: Block-row sparse multiview multilabel learning for image classification. IEEE Transactions on Cybernetics 46, 450–461 (2016)

    Article  Google Scholar 

  32. Zhu, X., Li, X., Zhang, S., Ju, C., Wu, X.: Robust joint graph sparse coding for unsupervised spectral feature selection. IEEE Transactions on Neural Networks and Learning Systems 26, 1263–1275 (2016)

    MathSciNet  Google Scholar 

  33. Zhu, X., Zhang, L., Zi, H.: A sparse embedding and least variance encoding approach to hashing. IEEE Trans. Image Process. 23, 3737–3750 (2014)

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Qing Xie.

Additional information

This article belongs to the Topical Collection: Special Issue on Deep Mining Big Social Data

Guest Editors: Xiaofeng Zhu, Gerard Sanroma, Jilian Zhang, and Brent C. Munsell

Part of the results in this work appeared in proceedings of the Big Data Partitioning and Mining Workshop associated with 2017 IEEE International Conference on Big Knowledge [11].

This research is partially supported by Natural Science Foundation of China (Grant No.61602353), National Social Science Foundation of China (Grant No.15BGL048) and the Fundamental Research Funds for the Central Universities (WUT:2017IVA053, WUT:2017IVB028 and WUT:2017II39GX).

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Liu, X., Liu, Y., Xie, Q. et al. A potential-based clustering method with hierarchical optimization. World Wide Web 21, 1617–1635 (2018). https://doi.org/10.1007/s11280-017-0509-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11280-017-0509-2

Keywords

Navigation