Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering

Mau, Toan Nguyen; Huynh, Van-Nam

doi:10.1007/978-3-030-85529-1_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12898)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence

Abstract

Cluster analysis plays an important role in exploring the correlations in data by dividing datasets into separate clusters so that similar objects are located in the same cluster. Moreover, fuzzy cluster analysis can reveal mixtures of clusters in datasets containing multiple distributions. The outcome of a clustering method is largely determined by its definition of similarity, so the similarity measure is crucial to the formation of fuzzy clusters. The similarity between two objects is usually calculated as the mean of their differences across multiple dimensions. However, dissimilarity in some dimensions has little or no effect on the fuzzy clustering outcome. In this study, we explore such effects for fuzzy clustering of data with categorical attributes. The impact of each attribute on each fuzzy cluster is calculated using an optimizer, and the overlapping dissimilar values are then adjusted by the corresponding weights. We apply this approach to the Fk-centers clustering algorithm, and the experimental results show that the proposed method achieves higher fuzzy silhouette scores than related methods. These results demonstrate the applicability of the proposed method to real-world applications.
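
To make the idea of per-cluster attribute weights concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes that each cluster center stores a relative-frequency distribution per attribute, that the dissimilarity is a weighted sum of "1 minus the frequency of the object's value", and that memberships follow the familiar fuzzy c-means form with fuzzifier `m`. The function names and the hand-picked weights are hypothetical; in the paper the weights are computed by an optimizer, which is not reproduced here.

```python
from collections import Counter

def build_center(rows):
    """Per-attribute relative frequencies of the categories in `rows` (assumed center form)."""
    n = len(rows)
    return [{v: c / n for v, c in Counter(col).items()} for col in zip(*rows)]

def weighted_dissimilarity(obj, center, weights):
    """Weighted sum over attributes of (1 - frequency of the object's value in the center)."""
    return sum(w * (1.0 - dist.get(v, 0.0))
               for v, dist, w in zip(obj, center, weights))

def fuzzy_memberships(obj, centers, weights_per_cluster, m=2.0):
    """Fuzzy c-means style memberships computed from per-cluster weighted dissimilarities."""
    d = [weighted_dissimilarity(obj, c, w)
         for c, w in zip(centers, weights_per_cluster)]
    # An object at zero distance from some centers is shared equally among them.
    zeros = [i for i, x in enumerate(d) if x == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    inv = [x ** (-1.0 / (m - 1.0)) for x in d]
    total = sum(inv)
    return [x / total for x in inv]

# Toy data: colour separates the two clusters, size does not.
centers = [
    build_center([("red", "small"), ("red", "large")]),
    build_center([("blue", "small"), ("blue", "large")]),
]
obj = ("red", "small")

uniform = [[0.5, 0.5], [0.5, 0.5]]    # every attribute counts equally
weighted = [[0.9, 0.1], [0.9, 0.1]]   # hand-set weights favouring colour (illustrative only)

print(fuzzy_memberships(obj, centers, uniform))   # approximately [0.75, 0.25]
print(fuzzy_memberships(obj, centers, weighted))  # approximately [0.95, 0.05]
```

In this toy case, down-weighting the uninformative attribute moves the object's membership in the first cluster from about 0.75 to about 0.95, which is the kind of effect the abstract attributes to attribute weighting.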

Author information

Authors and Affiliations

School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Toan Nguyen Mau & Van-Nam Huynh

Corresponding author

Correspondence to Toan Nguyen Mau.

Editor information

Editors and Affiliations

Umeå University, Umeå, Sweden
Vicenç Torra
Tamagawa University, Tokyo, Japan
Yasuo Narukawa

Cite this paper

Mau, T.N., Huynh, V.-N. (2021). Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_17

DOI: https://doi.org/10.1007/978-3-030-85529-1_17
Published: 20 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85528-4
Online ISBN: 978-3-030-85529-1
eBook Packages: Computer Science, Computer Science (R0)
