Representative points clustering algorithm based on density factor and relevant degree

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Most existing clustering algorithms are seriously affected by noisy data and incur a high time cost. In this paper, building on the CURE algorithm, a representative points clustering algorithm based on density factor and relevant degree, called RPCDR, is proposed. Definitions of density factor and relevant degree are presented. Any primary representative point whose density factor is less than the prescribed threshold is deleted directly, and new representative points can be reselected from the non-representative points in the corresponding cluster. Moreover, the representative points of each cluster are modeled using the K-nearest neighbor method. The relevant degree is computed by comprehensively considering the correlations of objects within a cluster and between different clusters, and is then used to judge whether two clusters should be merged. Theoretical analysis and experimental results indicate that RPCDR achieves better clustering accuracy and execution efficiency.
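The pruning and reselection of representative points described above can be illustrated with a minimal sketch. The abstract does not give the paper's definition of the density factor, so the code below substitutes a stand-in (the inverse of the mean distance to the k nearest neighbours within the cluster). The function names `knn_density` and `refresh_representatives`, the parameters `k` and `threshold`, and the toy data are all hypothetical and are not taken from RPCDR itself.

```python
import math


def knn_density(point, cluster, k):
    """Stand-in density factor (an assumption, not the paper's definition):
    inverse of the mean distance from `point` to its k nearest neighbours."""
    dists = sorted(math.dist(point, q) for q in cluster if q != point)
    nearest = dists[:k]
    if not nearest:
        return 0.0
    mean = sum(nearest) / len(nearest)
    return math.inf if mean == 0 else 1.0 / mean


def refresh_representatives(reps, cluster, k, threshold):
    """Delete representatives whose density is below `threshold`, then
    reselect replacements from the densest non-representative points."""
    kept = [p for p in reps if knn_density(p, cluster, k) >= threshold]
    missing = len(reps) - len(kept)
    # Only non-representative points that pass the threshold may be reselected.
    candidates = [
        p for p in cluster
        if p not in reps and knn_density(p, cluster, k) >= threshold
    ]
    candidates.sort(key=lambda p: knn_density(p, cluster, k), reverse=True)
    return kept + candidates[:missing]


# Toy cluster: four tightly grouped points and one distant, noise-like point.
cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (9.0, 9.0)]
reps = [(0.0, 0.0), (9.0, 9.0)]
new_reps = refresh_representatives(reps, cluster, k=2, threshold=0.5)
```

In this toy run the isolated point (9.0, 9.0) falls below the threshold and is replaced by a point from the dense group, which is the behaviour the abstract attributes to the density-factor filter. The actual threshold choice and density definition would have to come from the full paper.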

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61170190), the Natural Science Foundation of Hebei Province (Nos. F2015402114, F2015402070, F2015402119) and the Foundation of Hebei Educational Committee (No. YQ2014014). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Corresponding author

Correspondence to Di Wu.

About this article

Cite this article

Wu, D., Ren, J. & Sheng, L. Representative points clustering algorithm based on density factor and relevant degree. Int. J. Mach. Learn. & Cyber. 8, 641–649 (2017). https://doi.org/10.1007/s13042-015-0451-5

  • DOI: https://doi.org/10.1007/s13042-015-0451-5
