Abstract
A novel local density and relative distance-based spectrum clustering (LDRDSC) algorithm is proposed for clustering multidimensional data. The density spectra take into account both redefined local densities and relative distances. Spectral peaks are taken as cluster centers because they correspond to local density maxima, and different clusters correspond to different spectra. The proposed LDRDSC algorithm is validated against the clustering by fast search and find of density peaks (CFSFDP) algorithm on several benchmark data sets. Unlike CFSFDP, which must assign the remaining data points according to the cluster centers, LDRDSC clusters the remaining points automatically once the density spectrum has been generated. LDRDSC is also compared with five other typical clustering algorithms (DBSCAN, FCM, AP, Mean Shift and k-means) to assess its effectiveness. Computational results demonstrate that our algorithm obtains better clustering results than the above-mentioned algorithms, especially in identifying noise and isolated points.
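For orientation, the two quantities that density-peak methods build on can be sketched as follows. This is a minimal illustration of the CFSFDP baseline as it is commonly described (a cutoff-based local density and the distance to the nearest denser point). It is not the LDRDSC algorithm, whose redefined densities and spectrum construction are not given in this abstract. The function names, the cutoff value `d_c`, the toy data, and the use of the product of density and distance to rank candidate centers are all assumptions made for the example.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) per point, using the commonly stated CFSFDP definitions.

    rho[i]   = number of other points closer than the cutoff d_c (assumed cutoff kernel)
    delta[i] = distance to the nearest point with strictly higher rho; for a point
               with no denser neighbour, the largest distance to any other point
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def pick_centers(rho, delta, k):
    """Illustrative heuristic (an assumption): take the k largest rho * delta values."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])

# Hypothetical toy data: two small groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0),
    (10.0, 0.0),
]
rho, delta = density_peaks(points, d_c=0.12)
centers = pick_centers(rho, delta, k=2)
print("rho:", rho)
print("candidate centers:", centers)
```

In this toy example the isolated point has a large distance value but zero local density, which is the kind of signature a density-based method can use to flag noise or isolates rather than treat them as cluster centers.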
References
Frigui H, Krishnapuram R (1999) A robust competitive clustering algorithm with applications in computer vision. IEEE Trans Pattern Anal Mach Intell 21(5):450–465
Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S (2012) SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274–2282
Elhamifar E, Vidal R (2009) Sparse subspace clustering. In: IEEE conference on computer vision and pattern recognition, CVPR, pp 2790–2797
Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22(13):1658–1659
King AD, Pržulj N, Jurisica I (2004) Protein complex prediction via cost-based clustering. Bioinformatics 20(17):3013–3020
Huang DW, Sherman BT, Lempicki RA (2008) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4(1):44–57
Moosmann F, Nowak E, Jurie F (2008) Randomized clustering forests for image classification. IEEE Trans Pattern Anal Mach Intell 30(9):1632–1646
Ducournau A, Bretto A, Rital S, Laget B (2012) A reductive approach to hypergraph clustering: an application to image segmentation. Pattern Recognit 45(7):2788–2803
Chaira T (2011) A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images. Appl Soft Comput 11(2):1711–1717
Wang R, Ji W, Liu M, Wang X, Weng J, Deng S, Gao S, Yuan C (2018) Review on mining data from multiple data sources. Pattern Recognit Lett. https://doi.org/10.1016/j.patrec.2018.01.013
Wu J, Jin L, Liu M (2015) Evolving RBF neural networks for rainfall prediction using hybrid particle swarm optimization and genetic algorithm. Neurocomputing 148(2):136–142
Sunita AR, Jalal Anand S, Kumar JM (2010) A density based algorithm for discovering density varied clusters in large spatial databases. Int J Comput Appl 3(6):1–4
Hinneburg A, Gabriel H-H (2007) DENCLUE 2.0: fast clustering based on kernel density estimation. Adv Intell Data Anal VII Lect Notes Comput Sci 4723:70–80
Ester M, Kriegel H, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the second international conference on knowledge discovery and data mining, AAAI Press, Oregon, pp 226–231
Sander J, Ester M, Kriegel H, Xu X (1998) Density-based clustering in spatial data sets: the algorithm GDBSCAN and its applications. Data Min Knowl Disc 2:169–194
Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) OPTICS, ordering points to identify the clustering structure. In: ACM SIGMOD international conference on management of data, pp 49–60
Xu X, Jager J, Kriegel H (1999) A fast parallel clustering algorithm for large spatial databases. Data Min Knowl Disc 3(3):263–290
Zaiane O, Lee C (2002) Clustering spatial data in the presence of obstacles: a density-based approach. In: Proceedings of the IEEE symposium on international database engineering and applications, Edmonton, Canada, pp 214–223
Dash M, Liu H, Xu X (2001) ‘\(1+1 > 2\)’: merging distance and density based clustering. In: Proceedings of the seventh international conference on database systems for advanced applications, IEEE, Hong Kong, pp 32–39
Nasibov E, Ulutagay G (2009) Robustness of density-based clustering methods with various neighborhood relations. Fuzzy Sets Syst 160(24):3601–3615
Kieu L-M, Bhaskar A, Chung E (2015) A modified density-based scanning algorithm with noise for spatial travel pattern analysis from smart card AFC data. Transp Res Part C 58:193–207
Maadi AE, Djouadi MS (2015) Using a light DBSCAN algorithm for visual surveillance of crowded traffic scenes. IETE J Res 61(3):308–320
Chen X (2015) A new clustering algorithm based on near neighbor influence. Expert Syst Appl 42:7746–7758
Nanda SJ, Panda G (2015) Design of computationally efficient density-based clustering algorithms. Data Knowl Eng 95:23–38
Liu P, Zhou D, Wu N (2007) VDBSCAN: varied density based spatial clustering of application with noise. In: Proceedings of the IEEE international conference on service systems and service management, Chengdu, pp 528–531
Hinneburg A, Keim D (1998) An efficient approach to clustering in large multimedia databases with noise. In: Proceedings of the fourth international conference on knowledge discovery and data mining, New York, pp 58–65
Ma D, Zhan A (2004) An adaptive density-based clustering algorithm for spatial database with noise. In: Proceedings of the fourth IEEE international conference on data mining, Brighton, UK, pp 467–470
Gupta G, Liu A, Ghosh J (2010) Automated hierarchical density shaving: a robust automated clustering and visualization framework for large biological data sets. IEEE/ACM Trans Comput Biol Bioinform 7(2):223–237
Huang J, Sun H, Song Q, Deng H, Han J (2013) Revealing density-based clustering structure from the core-connected tree of a network. IEEE Trans Knowl Data Eng 25(8):1876
Li X, Ceikute V, Jensen CS, Tan K-L (2013) Effective online group discovery in trajectory databases. IEEE Trans Knowl Data Eng 25(12):2752
Rodriguez A, Laio A (2014) Clustering by fast search and find of density peaks. Science 344:1492–1496
Yu D, Ma X, Tu Y, Lai L (2015) Both piston-like and rotational motions are present in bacterial chemoreceptor signaling. Sci Rep 5:8640
Chen Y-W, Lai D-H, Qi H, Wang J-L, Du J-X (2015) A new method to estimate ages of facial image for large database. Multimed Tools Appl 75:2877. https://doi.org/10.1007/s11042-015-2485-9
Kumar P, Srinivasan B, Mohapatra NR (2015) Fast and accurate lithography simulation using cluster analysis in resist model building. J Micro/Nanolith MEMS MOEMS 14(2):023506
Alcalá-Fdez J, Sánchez L, García S, del Jesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fernández JC, Herrera F (2009) KEEL: a software tool to assess evolutionary algorithms to data mining problems. Soft Comput 13(3):307–318. https://doi.org/10.1007/s00500-008-0323-y
Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Multiple-Valued Log Soft Comput 17(2–3):255–287
Acknowledgements
This work was supported by the National Natural Science Foundation of China (41274109), the Innovative Team Project of Sichuan Province (2015TD0020) and the New Zealand Marsden Fund.
Cite this article
Liu, M., He, M., Wang, R. et al. A new local density and relative distance based spectrum clustering. Knowl Inf Syst 61, 965–985 (2019). https://doi.org/10.1007/s10115-018-1316-5
DOI: https://doi.org/10.1007/s10115-018-1316-5