Abstract
As one of the essential topics in ensemble learning, a clustering ensemble aggregates multiple base clusterings into a single clustering output to improve robustness and quality. In this work, we propose a novel clustering ensemble method, the shadowed set-based multi-granular three-way clustering ensemble (S-M3WCE). First, the approach generates a set of clustering members with the possibilistic C-means (PCM) clustering algorithm. All objects are then initially partitioned by shadowed sets into three regions, the core region, the shadowed region, and the exclusion region, according to their possibilistic membership degrees. This step uses the multiple clustering results to capture uncertain and noisy objects in the data set. Second, borrowing the idea of multi-granularity rough sets, objects are further assigned to four approximation regions by analyzing the uncertainty between objects and clusters. Objects in different approximation regions differ in their importance to clusters, and the approximation regions are related by a partial order. Finally, the four regions are processed again with shadowed sets, which produces the final three-way clustering output. The proposed method is evaluated on four artificial data sets and eight UCI data sets using three evaluation criteria: clustering accuracy, adjusted Rand index, and normalized mutual information. The experimental results show that the proposed algorithm achieves the best effectiveness and efficiency when compared with six representative clustering ensemble algorithms.
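The shadowed-set step described above can be illustrated with a minimal sketch. It is not the S-M3WCE implementation: it only shows how the membership degrees of objects to one cluster might be split into core, shadowed, and exclusion regions by a single threshold, and how that threshold could be chosen with Pedrycz's uncertainty-balance criterion. The function names, the threshold name alpha, the candidate grid, and the sample membership values are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def shadowed_partition(memberships: Sequence[float], alpha: float) -> Dict[str, List[int]]:
    """Split object indices into core / shadowed / exclusion regions.

    alpha is a hypothetical threshold in (0, 0.5): degrees >= 1 - alpha are
    elevated to the core, degrees <= alpha are reduced to the exclusion
    region, and everything in between falls into the shadow.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    regions: Dict[str, List[int]] = {"core": [], "shadowed": [], "exclusion": []}
    for idx, u in enumerate(memberships):
        if u >= 1.0 - alpha:
            regions["core"].append(idx)
        elif u <= alpha:
            regions["exclusion"].append(idx)
        else:
            regions["shadowed"].append(idx)
    return regions


def balance_alpha(memberships: Sequence[float], candidates: Sequence[float]) -> float:
    """Pick the candidate alpha minimizing |reduced + elevated - shadow size|.

    This follows the uncertainty-balance idea of shadowed sets: membership
    mass removed by reduction and elevation should match the size of the
    shadow. The candidate grid is an assumption of this sketch.
    """
    def imbalance(alpha: float) -> float:
        reduced = sum(u for u in memberships if u <= alpha)
        elevated = sum(1.0 - u for u in memberships if u >= 1.0 - alpha)
        shadow = sum(1 for u in memberships if alpha < u < 1.0 - alpha)
        return abs(reduced + elevated - shadow)

    return min(candidates, key=imbalance)


if __name__ == "__main__":
    # Hypothetical possibilistic memberships of seven objects to one cluster.
    degrees = [0.95, 0.88, 0.72, 0.5, 0.28, 0.12, 0.05]
    alpha = balance_alpha(degrees, candidates=[0.15, 0.25, 0.35])
    print(alpha, shadowed_partition(degrees, alpha))
```

In an ensemble setting, a partition of this kind would be computed per cluster and per clustering member; how the regions are then combined across members is specific to the paper and is not reproduced here.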




Data availability
All data used during the study are available in a repository or online in accordance with funder data retention policies (https://archive.ics.uci.edu/ml/datasets.php, http://cs.uef.fi/sipu/datasets/).
Acknowledgements
This work was supported in part by the Natural Science Foundation of Heilongjiang Province (LH2020F031).
Ethics declarations
Competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Jiang, C., Li, Z. & Yao, J. A shadowed set-based three-way clustering ensemble approach. Int. J. Mach. Learn. & Cyber. 13, 2545–2558 (2022). https://doi.org/10.1007/s13042-022-01543-5