Abstract
Rough fuzzy K-means (RFKM) partitions data into clusters using partial memberships under incomplete information, emphasizing the uncertainty of objects located in cluster boundaries. In this scheme, the cluster boundary is set purely by subjective, experience-based judgment. When the data exhibit heavy overlap and imbalance, the boundary regions obtained by existing empirical schemes vary greatly and the cluster centers become skewed, which considerably affects the accuracy and stability of RFKM. This paper analyzes and addresses this deficiency and proposes an improved rough fuzzy K-means clustering algorithm based on a parametric decision-theoretic shadowed set (RFKM-DTSS). Three-way approximation is implemented by incorporating a novel fuzzy entropy into the decision-theoretic shadowed set, which rationalizes the cluster boundary by minimizing fuzzy entropy loss. With a secondary adjustment method and an improved update strategy for cluster centers, RFKM-DTSS handles the class overlap and imbalance commonly seen in scenarios with unclear decision boundaries, such as fault detection and medical diagnosis. Comparative experiments verify the effectiveness and robustness of RFKM-DTSS and demonstrate its superiority.
Data availability
The datasets used and analysed during the current study are available in the UCI Machine Learning Repository.
Acknowledgements
The authors would like to thank the editors and anonymous referees for their helpful suggestions in the improvement of this manuscript.
Appendix A
(1) If \(E_{l}(a_{e} \mid x) \le E_{l}(a_{s\downarrow} \mid x)\), then perform action \(a_{e}\), and the value of \(u_{in}\) is elevated to 1.
(a) \(\psi - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} \le 0\). Because \(\frac{\max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) - \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})}{2} < \delta_{1}^{*} < \frac{\max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) + \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})}{2}\), it follows that \(\psi - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} \ge 0\), so inequality (a) does not hold.
(b) \(2\psi \times \frac{\left( u_{in} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} - \psi \ge 0\).
(2) If \(E_{l}(a_{s\downarrow} \mid x) \le E_{l}(a_{e} \mid x)\), then perform action \(a_{s\downarrow}\), and the value of \(u_{A}(x)\) is reduced to \(\delta^{*}\).
(a) \(\psi - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} \ge 0\). Because \(\frac{\max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) - \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})}{2} < \delta_{1}^{*} < \frac{\max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) + \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})}{2}\), it follows that \(\psi - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} \ge 0\), so inequality (a) holds.
(b) \(2\psi \times \frac{\left( u_{in} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} - \psi \times \frac{\left( \delta_{1}^{*} - \operatorname{center}(u_{ij}) \right)^{2}}{\operatorname{center}(u_{ij})^{2}} - \psi \le 0\).
Suppose \(u_{in} \ge \delta_{1}^{*}\). Then action \(a_{e}\) is performed if \(u_{in} \ge \alpha\), and action \(a_{s\downarrow}\) is performed if \(u_{in} \le \alpha\), where the threshold \(\alpha\) is \(\alpha = \frac{\left[ \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) + \max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) \right] + \sqrt{\min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})^{2} + \max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})^{2} - 2\delta_{1}^{*} \delta_{2}^{*}}}{2}\).
(3) If \(E_{l}(a_{s\uparrow} \mid x) \le E_{l}(a_{r} \mid x)\), then perform action \(a_{s\uparrow}\), and the value of \(u_{A}(x)\) is elevated to \(\delta^{*}\).
(4) If \(E_{l}(a_{r} \mid x) \le E_{l}(a_{s\uparrow} \mid x)\), then perform action \(a_{r}\), and the value of \(u_{A}(x)\) is reduced to 0.
Decision rules (3) and (4) can be proved in the same way as (1) and (2), which gives the threshold \(\beta = \frac{\left[ \min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) + \max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij}) \right] - \sqrt{\min\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})^{2} + \max\limits_{x_{j} \in \hat{C}_{i}} (u_{ij})^{2} - 2\delta_{1}^{*} \delta_{2}^{*}}}{2}\).
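As a rough illustration of how the two thresholds might be evaluated numerically, the following Python sketch computes α and β from the closed forms in this appendix and splits a list of memberships into three regions. It is not the authors' implementation. The function names, the sample memberships, and the values chosen for δ1* and δ2* are hypothetical. Assigning ties at α to the elevated region and ties at β to the reduced region is also an assumption, since the rules allow either action at equality.

```python
import math
from typing import List, Sequence, Tuple


def dtss_thresholds(
    memberships: Sequence[float], delta1: float, delta2: float
) -> Tuple[float, float]:
    """Evaluate the closed forms for alpha and beta given in Appendix A.

    `memberships` plays the role of u_ij over the objects of one cluster;
    `delta1` and `delta2` stand for delta_1^* and delta_2^* (values assumed
    to be supplied by the caller).
    """
    u_min = min(memberships)
    u_max = max(memberships)
    discriminant = u_min ** 2 + u_max ** 2 - 2.0 * delta1 * delta2
    if discriminant < 0:
        # The appendix formulas need a real square root.
        raise ValueError("negative value under the square root")
    root = math.sqrt(discriminant)
    alpha = ((u_min + u_max) + root) / 2.0
    beta = ((u_min + u_max) - root) / 2.0
    return alpha, beta


def three_way_partition(
    memberships: Sequence[float], alpha: float, beta: float
) -> Tuple[List[int], List[int], List[int]]:
    """Return index lists (elevated, shadow, reduced).

    u >= alpha -> elevated to 1; u <= beta -> reduced to 0;
    otherwise the object stays in the shadow region.
    Tie handling is an illustrative choice, not taken from the paper.
    """
    elevated, shadow, reduced = [], [], []
    for j, u in enumerate(memberships):
        if u >= alpha:
            elevated.append(j)
        elif u <= beta:
            reduced.append(j)
        else:
            shadow.append(j)
    return elevated, shadow, reduced


if __name__ == "__main__":
    # Hypothetical memberships of three objects in one cluster.
    u = [0.1, 0.4, 0.9]
    alpha, beta = dtss_thresholds(u, delta1=0.4, delta2=0.6)
    print(alpha, beta)
    print(three_way_partition(u, alpha, beta))
```

Because α and β share the same square-root term, they are placed symmetrically around half the sum of the smallest and largest membership, so the shadow region widens or narrows as the product δ1*δ2* changes.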
About this article
Cite this article
Zhang, Y., Zhang, T., Peng, C. et al. Rough Fuzzy K-Means Clustering Based on Parametric Decision-Theoretic Shadowed Set with Three-Way Approximation. Int. J. Fuzzy Syst. 26, 1698–1715 (2024). https://doi.org/10.1007/s40815-024-01700-8
DOI: https://doi.org/10.1007/s40815-024-01700-8