Abstract
In clustering analysis, data clusters are usually associated with feature subsets rather than the whole feature space. Soft subspace clustering therefore aims to find the subspace corresponding to each cluster by assigning weights to the features. However, because categorical data are qualitative rather than quantitative, subspace clustering for categorical data is challenging and has received relatively little study. This paper therefore presents a new two-step subspace clustering algorithm for categorical data. First, an initial clustering method is proposed to obtain a reliable initial cluster structure, on the basis of which feature-to-cluster groups are constructed by exploiting the intrinsic relationships between features. Subsequently, local clustering and global clustering are defined to learn the local cluster relations between data objects and to obtain the final clustering result, respectively. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method.
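To make the idea of per-cluster feature weighting concrete, the following is a minimal sketch of generic soft subspace weighting for categorical data. It is not the algorithm proposed in this paper: the weighting rule (the fraction of cluster members that match the cluster mode, normalised to sum to one), the helper names `cluster_modes`, `feature_weights`, and `weighted_distance`, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def cluster_modes(rows, labels, k):
    """Return, for each cluster, the most frequent value of every feature."""
    n_features = len(rows[0])
    modes = []
    for c in range(k):
        members = [r for r, l in zip(rows, labels) if l == c]
        modes.append([Counter(r[j] for r in members).most_common(1)[0][0]
                      for j in range(n_features)])
    return modes

def feature_weights(rows, labels, modes, k):
    """Assumed rule: weight feature j in cluster c by the share of members
    that agree with the cluster mode on j, normalised to sum to 1."""
    weights = []
    for c in range(k):
        members = [r for r, l in zip(rows, labels) if l == c]
        match = [sum(r[j] == modes[c][j] for r in members) / len(members)
                 for j in range(len(modes[c]))]
        total = sum(match)
        weights.append([m / total for m in match])
    return weights

def weighted_distance(row, mode, w):
    """Weighted Hamming distance: sum the weights of mismatching features."""
    return sum(wj for x, m, wj in zip(row, mode, w) if x != m)

# Toy data (hypothetical): the third feature is noise in both clusters.
rows = [
    ("red", "round", "a"), ("red", "round", "b"), ("red", "round", "c"),
    ("blue", "square", "a"), ("blue", "square", "b"), ("blue", "square", "c"),
]
labels = [0, 0, 0, 1, 1, 1]
k = 2

modes = cluster_modes(rows, labels, k)
weights = feature_weights(rows, labels, modes, k)

# A new object is assigned to the cluster with the smallest weighted distance.
query = ("red", "round", "z")
distances = [weighted_distance(query, modes[c], weights[c]) for c in range(k)]
assigned = distances.index(min(distances))
```

In this sketch the noisy third feature receives a smaller weight than the two consistent features in each cluster, so a mismatch on it contributes little to the distance. The sketch assumes every cluster has at least one member.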
This work was supported by the National Natural Science Foundation of China under Grant 61806131 and the Natural Science Foundation of Guangdong Province under Grant 2018A030310510.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jia, H., Dong, M. (2023). Subspace Clustering with Feature Grouping for Categorical Data. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, AM., Ma, W. (eds) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science, vol 14117. Springer, Cham. https://doi.org/10.1007/978-3-031-40283-8_21
DOI: https://doi.org/10.1007/978-3-031-40283-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40282-1
Online ISBN: 978-3-031-40283-8
eBook Packages: Computer Science, Computer Science (R0)