An enhanced visual approach for accessing the clustering tendency of big data

Chinnaiah, Veluru; Yadav, B. V. RamNaresh

doi:10.1007/s10619-021-07330-5

Published: 15 March 2021

Volume 41, pages 21–36, (2023)

Distributed and Parallel Databases

Abstract

Cluster analysis groups data objects according to an assessment of their similarity features. It is an essential unsupervised technique for unlabelled datasets. A primary problem of data clustering methods such as k-means is that the value of 'k' must be supplied through external interference (by the user), which is intractable. Determining the number of clusters 'k' is known as assessing the clustering tendency. Existing visual approaches, i.e., visual assessment of tendency (VAT), cosine-based VAT (cVAT), and cosine-based spectral VAT (CS-VAT), are suitable for determining the cluster tendency of regular data. Clustering using Improved Visual Assessment of Tendency (clusiVAT) performs better than other visual approaches for big data clustering. It uses a sampling technique for faster results; however, it works well mainly for Gaussian-generated datasets. Thus, the proposed work develops enhanced visual approaches for obtaining quality clusters for typical datasets. The performance of the enhanced visual approaches is demonstrated in an experimental study using benchmark datasets.
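
To make the VAT idea in the abstract concrete, the following is a minimal sketch of the classical VAT reordering step, in which objects are visited in a Prim-like nearest-neighbour order so that similar objects end up adjacent in the dissimilarity matrix. It is not the enhanced method proposed in the article, which the abstract does not specify. The function names (`pairwise_distances`, `vat_order`, `reorder`), the choice of Euclidean distance, and the six sample points are assumptions made for illustration only.

```python
import math


def pairwise_distances(points):
    """Return the full Euclidean dissimilarity matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def vat_order(dist):
    """Return a VAT-style ordering of object indices for a square dissimilarity matrix."""
    n = len(dist)
    # Seed with an object that takes part in the largest pairwise dissimilarity.
    first = max(range(n), key=lambda i: max(dist[i]))
    order = [first]
    remaining = set(range(n)) - {first}
    # nearest[j] is the distance from unselected object j to the selected set.
    nearest = {j: dist[first][j] for j in remaining}
    while remaining:
        # Append the unselected object closest to the selected set (ties broken by index).
        nxt = min(remaining, key=lambda j: (nearest[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            nearest[j] = min(nearest[j], dist[nxt][j])
    return order


def reorder(dist, order):
    """Permute rows and columns of the matrix according to the given ordering."""
    return [[dist[i][j] for j in order] for i in order]


# Illustrative data (assumed): two well-separated groups, interleaved in the input.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
dist = pairwise_distances(points)
order = vat_order(dist)
image = reorder(dist, order)
print(order)
```

When the reordered matrix `image` is displayed as a greyscale image, each well-separated group appears as a dark block along the diagonal, and counting those blocks gives a visual estimate of 'k'. This sketch stores the full n × n matrix, which is one reason sampling-based variants such as clusiVAT are used for big data.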

Author information

Authors and Affiliations

Department of CSE, Vijaya Engineering College, Khammam, Telangana, India
Veluru Chinnaiah
Department of CSE, JNTUH, Hyderabad, Telangana, India
B. V. RamNaresh Yadav

Corresponding author

Correspondence to Veluru Chinnaiah.

About this article

Cite this article

Chinnaiah, V., Yadav, B.V.R. An enhanced visual approach for accessing the clustering tendency of big data. Distrib Parallel Databases 41, 21–36 (2023). https://doi.org/10.1007/s10619-021-07330-5

Accepted: 03 March 2021
Published: 15 March 2021
Issue Date: June 2023
DOI: https://doi.org/10.1007/s10619-021-07330-5
