Abstract
Density-based algorithms have attracted considerable research attention because they can identify clusters of arbitrary shape in noisy datasets. DENCLUE is a density-based algorithm that clusters objects using a density function rather than proximity measurements within the data. DENCLUE is efficient at clustering high-dimensional datasets; however, it has difficulty discovering clusters with highly varying densities. To overcome this issue, this study proposes an enhanced variant of the DENCLUE algorithm, called VDENCLUE, based on varying kernel density estimation. VDENCLUE uses the local features of the data space, so clusters with arbitrary shapes and densities can be identified. To demonstrate the effectiveness of this approach, VDENCLUE was empirically evaluated and compared with the DENCLUE algorithm. Experimental results show that VDENCLUE outperforms DENCLUE on almost all datasets.
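To illustrate the difference between a fixed-bandwidth and a varying-bandwidth density estimate, the following minimal sketch may help. It is not the VDENCLUE implementation: the abstract does not specify how the bandwidth is varied, so the sketch assumes a Gaussian kernel and a sample-point estimator in which each data point's bandwidth is the distance to its k-th nearest neighbour. The function names (`knn_bandwidths`, `fixed_kde`, `variable_kde`) and the parameter `k` are hypothetical and chosen only for this example.

```python
import numpy as np


def knn_bandwidths(data, k):
    """Assumed bandwidth rule: distance from each point to its k-th nearest neighbour.

    Duplicate points can yield a zero bandwidth; this sketch does not guard against that.
    """
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # After sorting each row, column 0 is the point itself (distance 0).
    return np.sort(dists, axis=1)[:, k]


def fixed_kde(x, data, h):
    """Gaussian kernel density estimate at x with one global bandwidth h."""
    n, d = data.shape
    sq = ((x - data) ** 2).sum(axis=1)
    norm = (2 * np.pi) ** (d / 2) * h ** d
    return np.exp(-sq / (2 * h ** 2)).sum() / (n * norm)


def variable_kde(x, data, bandwidths):
    """Gaussian kernel density estimate at x with one bandwidth per data point."""
    n, d = data.shape
    sq = ((x - data) ** 2).sum(axis=1)
    norm = (2 * np.pi) ** (d / 2) * bandwidths ** d
    return (np.exp(-sq / (2 * bandwidths ** 2)) / norm).sum() / n


def demo():
    """Compare both estimates at the centre of a sparse cluster (synthetic data)."""
    rng = np.random.default_rng(0)
    dense = rng.normal(loc=0.0, scale=0.1, size=(200, 2))
    sparse = rng.normal(loc=5.0, scale=1.5, size=(50, 2))
    data = np.vstack([dense, sparse])
    x = np.array([5.0, 5.0])
    return fixed_kde(x, data, h=0.1), variable_kde(x, data, knn_bandwidths(data, k=5))
```

With a single small bandwidth, points in a sparse region contribute narrow, isolated bumps, whereas per-point bandwidths widen the kernels where neighbours are far apart. Calling `demo()` returns both estimates at the centre of the sparse cluster for comparison.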
Notes
1. Please note that the figures in this research are illustrative only, and the 12 points were selected for convenience rather than randomly.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Khader, M.S., Al-Naymat, G. (2021). VDENCLUE: An Enhanced Variant of DENCLUE Algorithm. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_33
DOI: https://doi.org/10.1007/978-3-030-55187-2_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55186-5
Online ISBN: 978-3-030-55187-2
eBook Packages: Intelligent Technologies and Robotics (R0)