Semantic feature hierarchical clustering algorithm based on improved regional merging strategy

Cluster Computing

Abstract

Hierarchical clustering based on semantic feature thresholds is widely applied in data mining. However, during the region merging phase this method uses systematic sampling to select the edge points of each region, which leads to under-merging and high computational complexity when regions are merged under the Euclidean distance criterion. To address this problem, an improved hierarchical clustering method is proposed that adopts the weighted Euclidean distance and optimizes the components of the region merging step. Based on the orientation of the centers of the two region clusters and the geometric relations among the edge points, two novel edge point selection strategies are established. They reduce the number of edge points involved in the computation while ensuring that the selected points have the minimum Euclidean distance among all edge point combinations. Experimental results show that the main advantages of the proposed method are its high clustering accuracy and lower average clustering time.
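The abstract does not specify the weighting scheme or the two selection strategies, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the weighted Euclidean distance has the common form sqrt(sum_i w_i (p_i - q_i)^2), approximates each region center by the mean of its edge points, and uses a hypothetical orientation filter (`facing_points`) that keeps only the edge points on the side facing the other region. All function names are illustrative and are not taken from the paper, and this simple filter does not by itself guarantee the global minimum for arbitrarily shaped regions.

```python
import math
from itertools import product


def weighted_euclidean(p, q, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (p_i - q_i)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))


def centroid(points):
    """Mean of a list of points (assumed stand-in for the region center)."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def facing_points(edge, own_center, other_center):
    """Hypothetical filter: keep edge points whose offset from own_center has a
    non-negative dot product with the center-to-center direction."""
    direction = [o - c for c, o in zip(own_center, other_center)]
    kept = [
        p for p in edge
        if sum((x - c) * d for x, c, d in zip(p, own_center, direction)) >= 0
    ]
    # Fall back to all edge points if the filter removes everything.
    return kept or list(edge)


def min_edge_distance(edge_a, edge_b, weights):
    """Smallest weighted distance over the reduced set of edge point pairs."""
    center_a, center_b = centroid(edge_a), centroid(edge_b)
    cand_a = facing_points(edge_a, center_a, center_b)
    cand_b = facing_points(edge_b, center_b, center_a)
    return min(
        weighted_euclidean(p, q, weights) for p, q in product(cand_a, cand_b)
    )


# Two unit squares described by their corner points (illustrative data).
region_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
region_b = [(3, 0), (4, 0), (3, 1), (4, 1)]
gap = min_edge_distance(region_a, region_b, weights=(1, 1))
```

In this example the filter halves the candidates on each side, so 4 pairs are compared instead of 16. Two regions would then be merged when the resulting distance falls below a chosen threshold.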

Author information

Corresponding author

Correspondence to Yao yu-feng.

About this article

Cite this article

yu-feng, Y. Semantic feature hierarchical clustering algorithm based on improved regional merging strategy. Cluster Comput 22 (Suppl 1), 1495–1503 (2019). https://doi.org/10.1007/s10586-018-1941-5

  • DOI: https://doi.org/10.1007/s10586-018-1941-5
