An interpretable neural network for robustly determining the location and number of cluster centers

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

K-means is a clustering method with an interpretable mechanism. However, its clustering results are strongly affected by the locations of the initial cluster centers. More importantly, for K-means and its improved versions it is extremely hard to adaptively determine the number of cluster centers. In contrast, ordinary neural networks have powerful information representation ability but lack interpretability, and, to the best of our knowledge, no interpretable neural network has been used to determine the number of cluster centers for K-means. This paper proposes K-meaNet, which combines the interpretable mechanism of K-means with the powerful information representation ability of neural networks. In the neural network of K-meaNet, the inputs, the weights, and the mathematical expression of each layer have clear meanings. During training, if a cluster center is critical, the weight acting as its gate increases and the center moves toward the ideal cluster center; otherwise, the location of the cluster center and the value of its gate do not change significantly. This endows K-meaNet with the ability to adaptively determine the location and number of cluster centers, in contrast to K-means and its improved versions. Moreover, this adaptive ability is robust to the locations of the initial cluster centers, the number of initial cluster centers, and the number of features. Numerical experiments on six synthetic datasets and three real datasets verify that K-meaNet adaptively determines the number of cluster centers and is robust to the locations of the initial cluster centers, the number of initial cluster centers, and the number of features.
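To make the gate mechanism described above concrete, the sketch below is a rough illustration only, not the authors' K-meaNet: the soft-assignment form, temperature tau, usage-based gate update, penalty lam, and pruning threshold are all illustrative assumptions. It trains one gate per candidate center alongside K-means-style center updates, so that well-used centers keep large gates and move toward their clusters while redundant centers are driven toward zero and can be discarded.

```python
# A minimal sketch of the gate idea, not the authors' K-meaNet: the temperature
# tau, penalty lam, learning rate, and pruning threshold below are illustrative
# assumptions. Each candidate center owns a trainable gate; gates of well-used
# centers grow while gates of redundant centers shrink toward zero.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three well-separated 2-D blobs.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2))
               for c in ([0, 0], [4, 0], [2, 3])])

K = 10                                               # deliberately too many candidate centers
C = X[rng.choice(len(X), K, replace=False)].copy()   # initial center locations
g = np.full(K, 0.5)                                  # one gate per candidate center
tau, lam, lr, epochs = 0.5, 0.05, 0.05, 200

for _ in range(epochs):
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)     # squared distances, shape (N, K)
    logits = -d2 / tau + np.log(np.clip(g, 1e-6, None))     # gated soft assignment
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)                             # soft memberships

    # Move each center toward its weighted mean; grow gates of well-used centers
    # and shrink gates of barely used ones (an L1-like pull toward zero).
    w = p.sum(0)                                             # per-center usage
    C += lr * ((p.T @ X) - w[:, None] * C) / (w[:, None] + 1e-9)
    g = np.clip(g + lr * (w / len(X) - lam), 0.0, 1.0)

kept = g > 0.1                                               # surviving centers
print("centers kept:", int(kept.sum()), "of", K)
print(np.round(C[kept], 2))
```

How many candidate centers survive depends on the assumed tau, lam, and threshold; the sketch only shows gates and center locations being learned jointly, which is the behavior the abstract attributes to K-meaNet.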

Data availability

All data generated or analyzed during this study are included in this published paper.

Funding

This work was supported in part by the National Key R&D Program of China under Grant 2019YFA0708700, in part by the National Natural Science Foundation of China under Grant 62173345, in part by the Fundamental Research Funds for the Central Universities under Grants 20CX05002A and 22CX03002A, and in part by the Joint Education Project for Universities in CEE Countries and China under Grant 2022151.

Author information

Corresponding authors

Correspondence to Yi-Fei Pu or Jian Wang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethics approval

This paper does not contain any experiments with human or animal participants performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Visualization comparisons of the clustering results on the KN2 (synthetic) dataset when the number of the initial cluster centers is 20, 30, 40, and 50

See Figs. 5, 6, 7 and 8.

Fig. 5

The clustering results of the classic K-means and the proposed K-meaNet on the KN2 (synthetic) dataset when the number of the initial cluster centers is 20. Here, green, turquoise, and blue dots represent samples in KN2S1, KN2S2, and KN2S3, respectively. In the left column, black circles represent the initial cluster centers. In the middle column, black circles represent the cluster centers obtained by K-means. In the right column, black and red circles represent the cluster centers adaptively discarded and determined by K-meaNet, respectively (color figure online)

Fig. 6

The clustering results of the classic K-means and the proposed K-meaNet on the KN2 (synthetic) dataset when the number of the initial cluster centers is 30. Here, green, turquoise, and blue dots represent samples in KN2S1, KN2S2, and KN2S3, respectively. In the left column, black circles represent the initial cluster centers. In the middle column, black circles represent the cluster centers obtained by K-means. In the right column, black and red circles represent the cluster centers adaptively discarded and determined by K-meaNet, respectively (color figure online)

Fig. 7

The clustering results of the classic K-means and the proposed K-meaNet on the KN2 (synthetic) dataset when the number of the initial cluster centers is 40. Here, green, turquoise, and blue dots represent samples in KN2S1, KN2S2, and KN2S3, respectively. In the left column, black circles represent the initial cluster centers. In the middle column, black circles represent the cluster centers obtained by K-means. In the right column, black and red circles represent the cluster centers adaptively discarded and determined by K-meaNet, respectively (color figure online)

Fig. 8

The clustering results of the classic K-means and the proposed K-meaNet on the KN2 (synthetic) dataset when the number of the initial cluster centers is 50. Here, green, turquoise, and blue dots represent samples in KN2S1, KN2S2, and KN2S3, respectively. In the left column, black circles represent the initial cluster centers. In the middle column, black circles represent the cluster centers obtained by K-means. In the right column, black and red circles represent the cluster centers adaptively discarded and determined by K-meaNet, respectively (color figure online)
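For readers who want to reproduce the three-panel layout of Figs. 5, 6, 7 and 8 (initial centers, K-means centers, and the centers K-meaNet keeps versus discards), a minimal plotting sketch follows. The sample blobs, the center arrays, and the keep/discard mask are placeholders standing in for the actual KN2 subsets and the centers each method produces.

```python
# A minimal plotting sketch in the style of Figs. 5-8; every array below is a
# placeholder to be replaced with the KN2 samples and the centers of each method.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Placeholder blobs standing in for KN2S1, KN2S2, and KN2S3.
blobs = [rng.normal(c, 0.3, size=(100, 2)) for c in ([0, 0], [4, 0], [2, 3])]
centers_init = rng.uniform(-1, 5, size=(20, 2))   # 20 initial centers (placeholder)
centers_kmeans = centers_init.copy()              # placeholder: replace with K-means output
keep = rng.random(20) > 0.85                      # placeholder keep/discard mask from K-meaNet

fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)
for ax, title in zip(axes, ["Initial centers", "K-means", "K-meaNet"]):
    for blob, color in zip(blobs, ["green", "turquoise", "blue"]):
        ax.scatter(blob[:, 0], blob[:, 1], s=8, c=color, alpha=0.5)
    ax.set_title(title)
axes[0].scatter(*centers_init.T, facecolors="none", edgecolors="k", s=60)
axes[1].scatter(*centers_kmeans.T, facecolors="none", edgecolors="k", s=60)
axes[2].scatter(*centers_init[~keep].T, facecolors="none", edgecolors="k", s=60)  # discarded
axes[2].scatter(*centers_init[keep].T, facecolors="none", edgecolors="r", s=60)   # determined
plt.tight_layout()
plt.show()
```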

Appendix 2: Non-visualization comparisons of the clustering results on the KN500 (synthetic), KN1000 (synthetic), KN2000 (synthetic), KN4000 (synthetic), Vehicle (real), and Wine (real) datasets

See Tables 6, 7, 8, 9, 10 and 11.

Table 6 The clustering results of the classic K-means and the proposed K-meaNet on the KN500 (synthetic) dataset
Table 7 The clustering results of the classic K-means and the proposed K-meaNet on the KN1000 (synthetic) dataset
Table 8 The clustering results of the classic K-means and the proposed K-meaNet on the KN2000 (synthetic) dataset
Table 9 The clustering results of the classic K-means and the proposed K-meaNet on the KN4000 (synthetic) dataset
Table 10 The clustering results of the classic K-means and the proposed K-meaNet on the Vehicle (real) dataset
Table 11 The clustering results of the classic K-means and the proposed K-meaNet on the Wine (real) dataset
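Since the exact quantities reported in Tables 6, 7, 8, 9, 10 and 11 are not reproduced here, the sketch below only illustrates how such non-visual comparisons are typically computed: it runs a scikit-learn K-means baseline with varying numbers of initial centers and scores each partition against ground-truth labels using standard external indices (ARI and NMI), which are assumptions standing in for whatever metrics the tables actually report.

```python
# A minimal evaluation sketch for table-style comparisons; the data and the
# choice of ARI/NMI are illustrative assumptions, not the paper's protocol.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

rng = np.random.default_rng(2)
# Placeholder data: swap in the KN500/KN1000/KN2000/KN4000, Vehicle, or Wine
# features and their ground-truth labels.
X = np.vstack([rng.normal(c, 0.3, size=(150, 2)) for c in ([0, 0], [4, 0], [2, 3])])
y_true = np.repeat([0, 1, 2], 150)

for k in (3, 20, 30):                       # vary the number of initial centers
    y_pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k:>2}  ARI={adjusted_rand_score(y_true, y_pred):.3f}  "
          f"NMI={normalized_mutual_info_score(y_true, y_pred):.3f}")
```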

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xie, X., Pu, YF., Zhang, H. et al. An interpretable neural network for robustly determining the location and number of cluster centers. Int. J. Mach. Learn. & Cyber. 15, 1473–1501 (2024). https://doi.org/10.1007/s13042-023-01978-4
