Representative points clustering algorithm based on density factor and relevant degree

Wu, Di; Ren, Jiadong; Sheng, Long

doi:10.1007/s13042-015-0451-5

Original Article
Published: 18 November 2015

Volume 8, pages 641–649, (2017)

International Journal of Machine Learning and Cybernetics

Di Wu¹,
Jiadong Ren² &
Long Sheng¹

Abstract

Most existing clustering algorithms are seriously affected by noisy data and incur a high time cost. Building on the CURE algorithm, this paper proposes RPCDR, a representative points clustering algorithm based on density factor and relevant degree. Definitions of the density factor and the relevant degree are presented. A primary representative point whose density factor is below the prescribed threshold is deleted directly, and new representative points can be reselected from the non-representative points in the corresponding cluster. Moreover, the representative points of each cluster are modeled using the K-nearest neighbor method. The relevant degree is computed by jointly considering the correlations of objects within a cluster and between different clusters, and it is then used to judge whether two clusters need to be merged. Theoretical analysis and experimental results show that RPCDR has better clustering accuracy and execution efficiency.
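
The representative-point filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formula for the density factor, so the definition used here (the fraction of the other cluster members lying within a radius `eps` of a candidate) is an assumption made purely for illustration, as are the names `scattered_points`, `density_factor`, `select_representatives` and the parameters `c`, `eps` and `threshold`. The only part taken from CURE itself is the farthest-point selection of well-scattered candidates.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def scattered_points(points, c):
    """CURE-style selection of up to c well-scattered points: start with the
    point farthest from the centroid, then repeatedly add the point that is
    farthest from the points already chosen."""
    mean = centroid(points)
    chosen = [max(points, key=lambda p: math.dist(p, mean))]
    while len(chosen) < c:
        rest = [p for p in points if p not in chosen]
        if not rest:
            break
        chosen.append(
            max(rest, key=lambda p: min(math.dist(p, q) for q in chosen))
        )
    return chosen


def density_factor(p, points, eps):
    """ASSUMED definition (not taken from the paper): fraction of the other
    cluster members that lie within distance eps of p."""
    others = [q for q in points if q != p]
    if not others:
        return 0.0
    return sum(1 for q in others if math.dist(p, q) <= eps) / len(others)


def select_representatives(points, c, eps, threshold):
    """Pick candidates, delete those whose density factor is below the
    threshold, and reselect from the remaining non-representative points."""
    pool = list(points)
    while pool:
        reps = scattered_points(pool, c)
        sparse = [p for p in reps if density_factor(p, points, eps) < threshold]
        if not sparse:
            return reps
        # Deleted candidates are treated as noise and never reconsidered.
        pool = [p for p in pool if p not in sparse]
    return []


cluster = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
print(select_representatives(cluster, c=2, eps=1.5, threshold=0.5))
```

In this toy cluster the isolated point `(10, 10)` is the first candidate picked by the farthest-point rule, but it has no neighbours within `eps`, so it is deleted and replaced by a point from the dense part of the cluster. The merge decision based on relevant degree is not sketched, because the abstract does not specify how it is computed.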

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61170190), the Natural Science Foundation of Hebei Province (Nos. F2015402114, F2015402070, F2015402119) and the Foundation of Hebei Educational Committee (No. YQ2014014). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Authors and Affiliations

Hebei University of Engineering, Handan, Hebei, China
Di Wu & Long Sheng
Yanshan University, Qinhuangdao, Hebei, China
Jiadong Ren

Corresponding author

Correspondence to Di Wu.

About this article

Cite this article

Wu, D., Ren, J. & Sheng, L. Representative points clustering algorithm based on density factor and relevant degree. Int. J. Mach. Learn. & Cyber. 8, 641–649 (2017). https://doi.org/10.1007/s13042-015-0451-5

Received: 06 February 2015
Accepted: 22 October 2015
Published: 18 November 2015
Issue Date: April 2017
DOI: https://doi.org/10.1007/s13042-015-0451-5
