Abstract
Covering rough sets conceptualize different types of features through their respective generated coverings. By integrating these coverings into a single covering, covering rough set-based feature selection finds valuable features in a mixed decision system with symbolic, real-valued, missing-valued and set-valued features. Existing approaches to covering rough set-based feature selection, however, are computationally intractable on large mixed data sets. Therefore, an efficient incremental feature selection strategy is proposed in which a mixed data set is presented as a sequence of sample subsets. Whenever a new sample subset arrives, the relative discernible relation of each feature is updated, which yields an incremental feature selection scheme that decides how to add informative features and remove redundant ones. This scheme is used to establish two incremental feature selection algorithms for large or dynamic mixed data sets. The first algorithm updates the feature subset upon the sequential arrival of sample subsets and returns the reduct when no further sample subsets arrive. The second merely updates the relative discernible relations and computes the reduct only when no further sample subsets arrive. Extensive experiments demonstrate that the two proposed incremental algorithms, especially the second one, speed up covering rough set-based feature selection without sacrificing too much classification performance.
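The incremental scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes purely symbolic features, so a feature "discerns" two samples simply when their values differ, whereas the paper builds the relation from coverings of mixed features. The greedy add step and the names `IncrementalSelector`, `update_relations`, `reduce`, `select_batchwise` and `select_at_end` are hypothetical and chosen only for illustration.

```python
class IncrementalSelector:
    """Toy incremental selector (assumption: symbolic features only)."""

    def __init__(self, n_features):
        self.samples = []                              # (values, label) seen so far
        self.rel = [set() for _ in range(n_features)]  # per-feature discerned pairs
        self.selected = []                             # current feature subset

    def update_relations(self, batch):
        """Extend each feature's relative discernible relation with a new subset."""
        for values, label in batch:
            j = len(self.samples)
            for i, (old_values, old_label) in enumerate(self.samples):
                if old_label == label:
                    continue  # only pairs with different decisions matter
                for a, pairs in enumerate(self.rel):
                    if old_values[a] != values[a]:     # simplified discernibility test
                        pairs.add((i, j))
            self.samples.append((values, label))

    def reduce(self):
        """Add informative features, then drop redundant ones."""
        target = set().union(*self.rel)                # pairs any feature can discern
        covered = set().union(*(self.rel[a] for a in self.selected))
        while covered != target:                       # greedy add step (assumption)
            best = max(
                (a for a in range(len(self.rel)) if a not in self.selected),
                key=lambda a: len(self.rel[a] - covered),
            )
            self.selected.append(best)
            covered |= self.rel[best]
        for a in list(self.selected):                  # redundancy removal step
            rest = [b for b in self.selected if b != a]
            if set().union(*(self.rel[b] for b in rest)) == target:
                self.selected = rest
        return sorted(self.selected)


def select_batchwise(batches, n_features):
    """First variant: refresh the feature subset after every sample subset."""
    selector = IncrementalSelector(n_features)
    result = []
    for batch in batches:
        selector.update_relations(batch)
        result = selector.reduce()
    return result


def select_at_end(batches, n_features):
    """Second variant: only update relations, reduce once at the end."""
    selector = IncrementalSelector(n_features)
    for batch in batches:
        selector.update_relations(batch)
    return selector.reduce()


batches = [
    [((0, 0, 0), "n"), ((1, 0, 0), "y")],
    [((0, 1, 0), "y"), ((1, 1, 0), "n")],
]
print(select_batchwise(batches, 3), select_at_end(batches, 3))
```

In this toy data the third feature is constant and is never selected. The second variant skips the intermediate reductions, which is the source of the speed-up the abstract attributes to the second algorithm.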
Acknowledgements
The authors declare that they have no conflict of interest. This paper is funded by the National Natural Science Foundation of China (grant numbers 61806108, 12171388, 12071131 and 52175493) and the Fundamental Research Funds for the Central Universities (grant number 2019RC055).
Cite this article
Yang, Y., Chen, D., Zhang, X. et al. Covering rough set-based incremental feature selection for mixed decision system. Soft Comput 26, 2651–2669 (2022). https://doi.org/10.1007/s00500-021-06687-0
DOI: https://doi.org/10.1007/s00500-021-06687-0