Abstract
Existing clustering methods rely on prior knowledge of the data set, so clustering quality depends on the user's familiarity with the data. Furthermore, when extracting information from a data set, existing clustering algorithms frequently ignore the geometric distribution of the data, which makes it difficult to identify data objects in their entirety and to detect local spatial structures. To address these issues, this paper proposes SSCG, a spatial subcluster clustering method by grid-connection. SSCG obtains subclusters automatically through iterative local labeling, without requiring a priori knowledge of the data set, and efficiently extracts correlations between data by establishing relationships between subclusters through grid-connection. Experiments validate the proposed algorithm against existing state-of-the-art algorithms on 9 synthetic and 4 real data sets. The results show that SSCG can efficiently use the information in the grid space without relying on a priori knowledge, and that its overall performance is better than that of the existing advanced algorithms.
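To make the idea of connecting data through a grid more concrete, the following is a minimal sketch of plain grid-connectivity clustering. It is not the SSCG algorithm described in the paper: it omits the iterative local labeling and subcluster steps, and the function name `grid_connect_clusters` and the parameter `cell_size` are assumptions introduced here for illustration only. Points are binned into grid cells, occupied cells that touch each other are merged with a union-find structure, and every point inherits the label of its cell's group.

```python
from collections import defaultdict
from itertools import product


def grid_connect_clusters(points, cell_size):
    """Label points by connecting adjacent non-empty grid cells.

    Illustrative only: `cell_size` is a hypothetical parameter, and this is
    generic grid-connectivity clustering, not the SSCG method itself.
    """
    # Map each occupied grid cell to the indices of the points inside it.
    cells = defaultdict(list)
    for idx, point in enumerate(points):
        cell = tuple(int(coord // cell_size) for coord in point)
        cells[cell].append(idx)

    # Union-find over occupied cells; each cell starts as its own root.
    parent = {cell: cell for cell in cells}

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Merge every occupied cell with its occupied neighbors
    # (all cells that differ by at most one step along each axis).
    dims = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    for cell in cells:
        for offset in offsets:
            neighbor = tuple(c + d for c, d in zip(cell, offset))
            if neighbor in cells:
                union(cell, neighbor)

    # Give each connected group of cells a consecutive cluster id.
    labels = [0] * len(points)
    root_to_label = {}
    for cell, members in cells.items():
        root = find(cell)
        label = root_to_label.setdefault(root, len(root_to_label))
        for idx in members:
            labels[idx] = label
    return labels


if __name__ == "__main__":
    sample = [(0.1, 0.1), (0.9, 0.4), (1.2, 0.8), (5.0, 5.0), (5.6, 5.3)]
    print(grid_connect_clusters(sample, cell_size=1.0))
```

In this sketch the result depends on the choice of `cell_size`, which is exactly the kind of user-supplied prior knowledge the abstract says SSCG avoids; the sketch only illustrates how grid adjacency can link nearby data.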
Y. Zhang and X. Han contributed equally to this work.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Y., Han, X., Wang, L., Chen, W., Guo, L. (2023). SSCG: Spatial Subcluster Clustering Method by Grid-Connection. In: Li, B., Yue, L., Tao, C., Han, X., Calvanese, D., Amagasa, T. (eds) Web and Big Data. APWeb-WAIM 2022. Lecture Notes in Computer Science, vol 13422. Springer, Cham. https://doi.org/10.1007/978-3-031-25198-6_32
DOI: https://doi.org/10.1007/978-3-031-25198-6_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25197-9
Online ISBN: 978-3-031-25198-6
eBook Packages: Computer Science, Computer Science (R0)