Abstract
Fuzzy-rough set theory is an effective tool for attribute reduction because it handles the imprecision and uncertainty in the data. Despite its efficacy, current approaches to fuzzy-rough attribute reduction do not scale to large data sets because of their high space complexity. A limited number of accelerators and parallel/distributed approaches have been proposed for fuzzy-rough attribute reduction in large data sets. However, all of these approaches are dependency-measure-based methods, in which fuzzy similarity matrices are used to perform attribute reduction. Alternative discernibility-matrix-based attribute reduction methods are found to have lower space requirements and to be more amenable to parallelization when building parallel/distributed algorithms. This paper therefore introduces a fuzzy discernibility matrix-based attribute reduction accelerator (DARA) to accelerate attribute reduction. DARA is used to build a sequential approach and a corresponding parallel/distributed approach for attribute reduction in large data sets. The proposed approaches are compared with existing state-of-the-art approaches in a systematic experimental analysis to assess computational efficiency. The experimental study, along with theoretical validation, shows that the proposed approaches are effective and perform better than the current approaches.



Materials Availability
Data and material are available with the authors.
Code Availability
Code is available with the authors.
References
Pawlak Z (1982) Rough sets. Int J Comput Inform Sci 11(5):341–356
Yao Y, Zhao Y (2008) Attribute reduction in decision-theoretic rough set models. Inf Sci 178(17):3356–3373
Zhao S, Chen H, Li C, Du X, Sun H (2015) A novel approach to building a robust fuzzy rough classifier. IEEE Trans Fuzzy Syst 23(4):769–786
Hu Q, Yu D, Xie Z (2006) Information-preserving hybrid data reduction based on fuzzy-rough techniques. Pattern Recogn Lett 27(5):414–423
Jensen R, Shen Q (2009) New approaches to fuzzy-rough feature selection. IEEE Trans Fuzzy Syst 17(4):824–838
Dubois D, Prade H (1990) Rough fuzzy sets and fuzzy rough sets. Int J Gen Syst 17(2-3):191–209
Radzikowska AM, Kerre EE (2002) A comparative study of fuzzy rough sets. Fuzzy Sets Syst 126(2):137–155
Cornelis C, Cock MD, Radzikowska AM (2008) Fuzzy rough sets: From theory into practice. In: Handbook of granular computing. Wiley Ltd, pp 533–552
Ye J, Zhan J, Ding W, Fujita H (2021) A novel fuzzy rough set model with fuzzy neighborhood operators. Inf Sci 544:266–297
Cornelis C, Jensen R, Hurtado G, Ślęzak D (2010) Attribute selection with fuzzy decision reducts. Inf Sci 180(2):209–224
Parthaláin NM, Jensen R (2013) Unsupervised fuzzy-rough set-based dimensionality reduction. Inf Sci 229:106–121
Jensen R (2008) Rough set-based feature selection. In: Rough computing. IGI Global, pp 70–107
Wang J, Wang J (2001) Reduction algorithms based on discernibility matrix: The ordered attributes method. J Comput Sci Technol 16(6):489–504
Yao Y, Zhao Y (2009) Discernibility matrix simplification for constructing attribute reducts. Inf Sci 179(7):867–882
Sai Prasad PSVS, Rao CR (2011) Extensions to IQuickReduct. In: Lecture notes in computer science. Springer Berlin, pp 351–362
Janusz A, Ślęzak D (2014) Rough set methods for attribute clustering and selection. Appl Artif Intell 28(3):220–242
Chouchoulas A, Shen Q (2001) Rough set-aided keyword reduction for text categorization. Appl Artif Intell 15(9):843–873
Chen Y, Liu K, Song J, Fujita H, Yang X, Qian Y (2020) Attribute group for attribute reduction. Inf Sci 535(5):64–80
Liu K, Yang X, Yu H, Fujita H, Chen X, Liu D (2020) Supervised information granulation strategy for attribute reduction. Int J Mach Learn Cybern, pp 1–15
Dai J, Hu H, Wu W-Z, Qian Y, Huang D (2018) Maximal-discernibility-pair-based approach to attribute reduction in fuzzy rough sets. IEEE Trans Fuzzy Syst 26(4):2174–2187
Wang C, Huang Y, Shao M, Fan X (2019) Fuzzy rough set-based attribute reduction using distance measures. Knowl-Based Syst 164:205–212
Wang C, Qi Y, Shao M, Hu Q, Chen D, Qian Y, Lin Y (2017) A fitting model for feature selection with fuzzy rough sets. IEEE Trans Fuzzy Syst 25(4):741–753
Zhang X, Mei C, Chen D, Yang Y (2018) A fuzzy rough set-based feature selection method using representative instances. Knowl-Based Syst 151:216–229
Tan A, Wu W-Z, Qian Y, Liang J, Chen J, Li J (2019) Intuitionistic fuzzy rough set-based granular structures and attribute subset selection. IEEE Trans Fuzzy Syst 27(3):527–539
Kumar A, Sai Prasad PSVS (2020) Scalable fuzzy rough set reduct computation using fuzzy min–max neural network preprocessing. IEEE Trans Fuzzy Syst 28(5):953–964
Riza LS, Janusz A, Bergmeir C, Cornelis C, Herrera F, Ślęzak D, Benítez JM (2014) Implementing algorithms of rough set theory and fuzzy rough set theory in the R package “RoughSets”. Inf Sci 287:68–89
Sai Prasad PSVS, Rao CR (2014) An efficient approach for fuzzy decision reduct computation. In: Transactions on rough sets XVII. Springer Berlin, pp 82–108
Qian Y, Wang Q, Cheng H, Liang J, Dang C (2015) Fuzzy-rough feature selection accelerator. Fuzzy Sets Syst 258:61–78
Jensen R, Parthaláin NM (2015) Towards scalable fuzzy–rough feature selection. Inf Sci 323:1–15
Ni P, Zhao S, Wang X, Chen H, Li C (2019) PARA: A positive-region based attribute reduction accelerator. Inf Sci 503:533–550
Chen J, Mi J, Lin Y (2020) A graph approach for fuzzy-rough feature selection. Fuzzy Sets Syst 391:96–116
Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1):107
Sowkuntla P, Sai Prasad PSVS (2020) MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme. Knowl-Based Syst 189:105104
Raza MS, Qamar U (2018) A parallel rough set based dependency calculation method for efficient feature selection. Appl Soft Comput 71:1020–1034
Qian J, Miao D, Zhang Z, Yue X (2014) Parallel attribute reduction algorithms using MapReduce. Inf Sci 279:671–690
Sai Prasad PSVS, Subrahmanyam HB, Singh PK (2016) Scalable IQRA_IG algorithm: An iterative MapReduce approach for reduct computation. In: Distributed computing and internet technology. Springer International Publishing, pp 58–69
Singh PK, Sai Prasad PSVS (2016) Scalable quick reduct algorithm: Iterative MapReduce approach. In: Proceedings of the 3rd IKDD conference on data science. 2016. ACM, p 25
Czolombitko M, Stepaniuk J (2016) Attribute reduction based on MapReduce model and discernibility measure. In: Computer information systems and industrial management. Springer International Publishing, pp 55–66
Pavani NL, Sowkuntla P, Rani KS, Sai Prasad PSVS (2019) Fuzzy rough discernibility matrix based feature subset selection with MapReduce. In: TENCON 2019 - 2019 IEEE region 10 conference (TENCON). IEEE, pp 389–394
Bandagar K, Sowkuntla P, Moiz SA, Sai Prasad PSVS (2019) MR_IMQRA: An efficient MapReduce based approach for fuzzy decision reduct computation. In: International conference on pattern recognition and machine intelligence. Springer International Publishing, pp 306–316
Kong L, Qu W, Yu J, Zuo H, Chen G, Xiong F, Pan S, Lin S, Qiu M (2020) Distributed feature selection for big data using fuzzy rough sets. IEEE Trans Fuzzy Syst 28(5):846–857
Hu Q, Zhang L, Zhou Y, Pedrycz W (2018) Large-scale multimodality attribute reduction with multi-kernel fuzzy rough sets. IEEE Trans Fuzzy Syst 26(1):226–238
Ding W, Wang J, Wang J (2020) Multigranulation consensus fuzzy-rough based attribute reduction. Knowl-Based Syst, p 105945
Cock MD, Cornelis C, Kerre EE (2007) Fuzzy rough sets: The forgotten step. IEEE Trans Fuzzy Syst 15(1):121–130
Zaharia M, Xin RS, Wendell P, Das T, Armbrust M, Dave A, Meng X, Rosen J, Venkataraman S, Franklin MJ, et al. (2016) Apache Spark: a unified engine for big data processing. Commun ACM 59(11):56–65
Inoubli W, Aridhi S, Mezni H, Maddouri M, Nguifo EM (2018) An experimental survey on big data frameworks. Futur Gener Comput Syst 86:546–564
Jakovits P, Srirama SN (2014) Evaluating MapReduce frameworks for iterative scientific computing applications. In: 2014 International conference on high performance computing & simulation (HPCS). IEEE, pp 226–233
(2017) UCI machine learning repository. http://archive.ics.uci.edu/ml/datasets.html
Acknowledgments
The authors are grateful to the reviewers for their valuable comments and suggestions. This work is supported by the Department of Science and Technology (DST), Government of India, under the ICPS project [grant numbers: DST/ICPS/CPS-Individual/2018/579(G) and DST/ICPS/CPS-Individual/2018/579(C)], and by the Digital India Corporation of the Ministry of Electronics and Information Technology, Government of India, under the Visvesvaraya Ph.D. scheme with the unique id: MEITY-PHD-1039. We would like to extend our sincere thanks to the authors of the PARA [30] algorithm for providing the source code.
Funding
This work is supported by the Department of Science and Technology (DST), Government of India, under the ICPS project [grant numbers: DST/ICPS/CPS-Individual/2018/579(G) and DST/ICPS/CPS-Individual/2018/579(C)], and by the Digital India Corporation, a Section 8 Company of the Ministry of Electronics and Information Technology, Government of India, under the Visvesvaraya Ph.D. scheme with the unique id: MEITY-PHD-1039.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Sowkuntla, P., Prasad, P.S.V.S.S. MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix. Appl Intell 52, 154–173 (2022). https://doi.org/10.1007/s10489-021-02253-1
DOI: https://doi.org/10.1007/s10489-021-02253-1