Abstract
The basic solution for locating an optimal reduct is to generate all possible reducts and select the one that best meets the given criterion. Since this problem is NP-hard, most attribute reduction algorithms use heuristics to find a single reduct, at the risk of overlooking the best ones. A discernibility function (DF)-based approach exists that generates all reducts, but it may fail due to memory overflow even for datasets of modest dimensionality. In this study, we show that the main shortcoming of this approach is its excessively high space complexity. To overcome this, we first represent a DF of \(n\) attributes by a bit-matrix (BM). Second, we partition the BM into no more than \(n-1\) sub-BMs (SBMs). Third, we convert each SBM into a subset of reducts while preventing the generation of redundant products, and finally, we unite these subsets into the complete set of reducts. Among the SBMs of a BM, the most complex one is the first SBM, whose space complexity is at most the square root of that of the original BM. The proposed algorithm converts such an SBM with \(n\) attributes into its subset of reducts with a worst-case space complexity of \(\binom{n}{n/2}/2\).
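To illustrate the kind of transformation the abstract describes, the sketch below encodes a DF in conjunctive normal form as a bit-matrix (one integer bitmask per clause, bit \(i\) marking attribute \(a_i\)) and expands it into the set of all reducts by distributing clauses over products and absorbing non-minimal ones. This is a minimal Thelen-style expansion for illustration only, not the paper's partitioned SBM algorithm; the function name and the toy DF are our own.

```python
# Toy sketch: converting a discernibility function (CNF) given as a
# bit-matrix into its full set of reducts (minimal hitting sets).
# NOT the paper's SBM-partitioning algorithm -- a naive expansion
# with absorption, for illustration only.

def cnf_to_reducts(clauses):
    """Expand a CNF bit-matrix into all minimal products (reducts).

    clauses: iterable of int bitmasks, one per discernibility clause;
             bit i set means attribute a_i occurs in the clause.
    Returns the set of bitmasks of minimal attribute subsets that
    intersect every clause.
    """
    products = {0}  # start from the empty product
    for clause in clauses:
        expanded = set()
        for p in products:
            if p & clause:              # product already hits this clause
                expanded.add(p)
                continue
            bits = clause
            while bits:                 # distribute: branch on each literal
                low = bits & -bits      # isolate the lowest set bit
                expanded.add(p | low)
                bits ^= low
        # absorption: drop every product that strictly contains another,
        # which is exactly the redundant-product suppression the text mentions
        products = {p for p in expanded
                    if not any(q != p and q & p == q for q in expanded)}
    return products

# Hypothetical DF over 4 attributes: (a0+a1)(a1+a2)(a0+a3)
clauses = [0b0011, 0b0110, 0b1001]
reducts = cnf_to_reducts(clauses)
# yields the three reducts {a0,a1}, {a0,a2}, {a1,a3}
```

The bitmask encoding makes both the clause-hit test and the absorption check a single bitwise operation, which is the practical payoff of the bit-matrix representation.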





Cite this article
Hacibeyoglu, M., Salman, M.S., Selek, M. et al. The logic transformations for reducing the complexity of the discernibility function-based attribute reduction problem. Knowl Inf Syst 46, 599–628 (2016). https://doi.org/10.1007/s10115-015-0824-9