Abstract
Missing values and irrelevant features are commonplace problems that must be handled effectively. Missing value imputation and feature selection are efficient techniques for addressing them. Fuzzy rough set-based approaches offer several solutions for additionally handling the vagueness and uncertainty present in the data. This paper introduces missing value imputation followed by feature selection, both using fuzzy rough set-based approaches. The ideas of missing value estimation and instance ignorance are combined for fuzzy rough missing value imputation that employs only correlated features, followed by feature selection with a search heuristic. Experimental evaluation on benchmark datasets demonstrates the applicability and robustness of the proposed work: it significantly reduces data dimensionality after imputing missing values while maintaining high performance. A comparative analysis demonstrates the superiority of the proposed methodology.
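The two-stage idea summarised above (impute from correlated features, then select features with a search heuristic) can be illustrated with a minimal sketch. It is not the authors' algorithm: the similarity measure `1 - |a - b| / range`, the correlation threshold, the treatment of donors with missing related features, the dependency-based greedy search, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_similarity(col):
    """Pairwise fuzzy similarity 1 - |a - b| / range for one feature (assumed measure)."""
    rng = np.nanmax(col) - np.nanmin(col)
    if rng == 0:
        return np.ones((len(col), len(col)))
    return 1.0 - np.abs(col[:, None] - col[None, :]) / rng

def impute(X, corr_threshold=0.5):
    """Fill each NaN with a similarity-weighted mean computed from correlated features only."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    n_features = X.shape[1]
    for j in range(n_features):
        missing = np.isnan(X[:, j])
        if not missing.any():
            continue
        donors = np.flatnonzero(~missing)
        # Keep features whose |Pearson correlation| with feature j reaches the threshold.
        related = []
        for k in range(n_features):
            both = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            if k == j or both.sum() < 3:
                continue
            if np.std(X[both, j]) == 0 or np.std(X[both, k]) == 0:
                continue
            if abs(np.corrcoef(X[both, j], X[both, k])[0, 1]) >= corr_threshold:
                related.append(k)
        for i in np.flatnonzero(missing):
            weights = np.ones(len(donors))
            for k in related:
                if np.isnan(X[i, k]):
                    continue  # this feature is unknown for instance i, so skip it
                rng = np.nanmax(X[:, k]) - np.nanmin(X[:, k])
                dist = np.abs(X[donors, k] - X[i, k]) / rng
                # Donors lacking feature k get zero similarity (they are ignored).
                sim = np.where(np.isnan(dist), 0.0, 1.0 - dist)
                weights = np.minimum(weights, sim)
            if weights.sum() > 0:
                filled[i, j] = weights @ X[donors, j] / weights.sum()
            else:
                filled[i, j] = X[donors, j].mean()  # fallback: plain column mean
    return filled

def dependency(X, y, features):
    """Fuzzy rough dependency degree of the class labels on a feature subset."""
    n = len(y)
    rel = np.ones((n, n))
    for a in features:
        rel = np.minimum(rel, fuzzy_similarity(X[:, a]))
    different = y[:, None] != y[None, :]
    # Lower-approximation membership of each instance to its own class.
    pos = np.where(different, 1.0 - rel, 1.0).min(axis=1)
    return pos.mean()

def greedy_select(X, y, tol=1e-9):
    """Forward greedy search: add the feature that most increases the dependency."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [dependency(X, y, selected + [a]) for a in remaining]
        idx = int(np.argmax(scores))
        if scores[idx] <= best + tol:
            break
        best = scores[idx]
        selected.append(remaining.pop(idx))
    return selected, best

# Toy data (hypothetical): feature 1 is twice feature 0, feature 2 is unrelated.
X = np.array([[0.0, 0.0, 0.5], [0.1, 0.2, 0.9], [0.2, np.nan, 0.1],
              [0.8, 1.6, 0.4], [0.9, 1.8, 0.8], [1.0, 2.0, 0.2]])
y = np.array([0, 0, 0, 1, 1, 1])
filled = impute(X)
selected, score = greedy_select(filled, y)
print(filled[2, 1], selected, score)
```

In this sketch the imputed value is a convex combination of observed values, so it always stays within the observed range of the feature; the greedy search stops as soon as no remaining feature raises the dependency degree.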
Data Availability
All data analysed during this study are from UCI [30], OpenML [31] and OpenMV.net (https://openmv.net/tag/missing-data).
References
Gupta A, Lam MS (1996) Estimating missing values using neural networks. J Oper Res Soc 47(2):229–238
Song S, Sun Y, Zhang A, Chen L, Wang J (2018) Enriching data imputation under similarity rule constraints. IEEE Trans Knowl Data Eng
Honghai F, Guoshun C, Cheng Y, Bingru Y, Yumei C (2005) A SVM regression based approach to filling in missing values. In: International Conference on Knowledge-Based and Intelligent Information and Engineering Systems. Springer, pp 581–587
Liao Z, Lu X, Yang T, Wang H (2009) Missing data imputation: a fuzzy k-means clustering algorithm over sliding window. In: 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 3. IEEE, pp 133–137
de França FO, Coelho GP, Von Zuben FJ (2013) Predicting missing values with biclustering: A coherence-based approach. Pattern Recogn 46(5):1255–1266
Liu Z-G, Pan Q, Dezert J, Martin A (2016) Adaptive imputation of missing values for incomplete pattern classification. Pattern Recogn 52:85–95
Sefidian AM, Daneshpour N (2019) Missing value imputation using a novel grey based fuzzy c-means, mutual information based feature selection, and regression model. Expert Syst Appl 115:68–94
Rastegar S, Araujo R, Mendes J (2017) Online identification of Takagi-Sugeno fuzzy models based on self-adaptive hierarchical particle swarm optimization algorithm. Appl Math Model 45:606–620
Silva-Ramirez E-L, Cabrera-Sánchez J-F (2022) Correction to: Co-active neuro-fuzzy inference system model as single imputation approach for non-monotone pattern of missing data. Neural Comput Appl 34(3):2495–2496
Shu W, Shen H (2014) Incremental feature selection based on rough set in dynamic incomplete data. Pattern Recogn 47(12):3890–3906
Safi M (2021) Data imputation using differential dependency and fuzzy multi-objective linear programming. Ph.D. thesis, University of Windsor (Canada)
Choudhury SJ, Pal NR (2022) Fuzzy clustering of single-view incomplete data using a multi-view framework. IEEE Trans Fuzzy Syst
Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Intelligent Decision Support. Springer, pp 203–232
Raja P, Sasirekha K, Thangavel K (2019) A novel fuzzy rough clustering parameter-based missing value imputation. Neural Comput Appl 1–18
Jain P, Tiwari AK, Som T (2020) A fitting model based intuitionistic fuzzy rough feature selection. Eng Appl Artif Intell 89:103421
Jain P, Tiwari AK, Som T (2022) An intuitionistic fuzzy bireduct model and its application to cancer treatment. Comput Ind Eng 168:108124
Jain P, Tiwari AK, Som T (2021) Enhanced prediction of anti-tubercular peptides from sequence information using divergence measure-based intuitionistic fuzzy-rough feature selection. Soft Comput 25(4):3065–3086
Huang Z, Li J (2022) Noise-tolerant discrimination indexes for fuzzy v covering and feature subset selection. IEEE Trans Neural Netw Learn Syst
Zhang X, Mei C, Chen D, Li J (2016) Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy. Pattern Recogn 56:1–15
Li Y, Wu Z-F (2008) Fuzzy feature selection based on min-max learning rule and extension matrix. Pattern Recogn 41(1):217–226
Yuan Z, Chen H, Li T (2022) Exploring interactive attribute reduction via fuzzy complementary entropy for unlabeled mixed data. Pattern Recogn 127:108651
Wan J, Chen H, Li T, Sang B, Yuan Z (2022) Feature grouping and selection with graph theory in robust fuzzy rough approximation space. IEEE Trans Fuzzy Syst
Qiu Z, Zhao H (2022) A fuzzy rough set approach to hierarchical feature selection based on Hausdorff distance. Appl Intell 1–14
Dengfeng L, Chuntian C (2002) New similarity measures of intuitionistic fuzzy sets and application to pattern recognitions. Pattern Recogn Lett 23(1–3):221–225
Radzikowska AM, Kerre EE (2002) A comparative study of fuzzy rough sets. Fuzzy Sets Syst 126(2):137–155
Jensen R, Mac Parthaláin N, Cornelis C (2014) Feature grouping-based fuzzy-rough feature selection. In: 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE, pp 1488–1495
Liu H, Motoda H (2012) Feature selection for knowledge discovery and data mining, vol 454. Springer Science & Business Media, Berlin
Wang G-G, Deb S, Zhao X, Cui Z (2018) A new monarch butterfly optimization with an improved crossover operator. Oper Res 18(3):731–755
Wang G-G, Zhao X, Deb S (2015) A novel monarch butterfly optimization with greedy strategy and self-adaptive. In: 2015 Second International Conference on Soft Computing and Machine Intelligence (ISCMI). IEEE, pp 45–50
Asuncion A, Newman D (2007) UCI machine learning repository
Vanschoren J, Van Rijn JN, Bischl B, Torgo L (2014) OpenML: networked science in machine learning. ACM SIGKDD Explor Newsl 15(2):49–60
Singh S, Haddon J, Markou M (2001) Nearest-neighbour classifiers in natural scene analysis. Pattern Recogn 34(8):1601–1612
Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Comput 13(3):637–649
Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92
Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56(293):52–64
Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult Valued Logic Soft Comput 17
Maini T, Kumar A, Misra RK, Singh D. Intelligent fuzzy rough set based feature selection using swarm algorithms with improved initialization. J Intell Fuzzy Syst (Preprint) 1–10
Wang C, Qi Y, Shao M, Hu Q, Chen D, Qian Y, Lin Y (2016) A fitting model for feature selection with fuzzy rough sets. IEEE Trans Fuzzy Syst 25(4):741–753
Acknowledgements
This research work is funded by a UGC Research Fellowship, India (Grant no: 3600/(PWD)(NET-NOV2017)), awarded to the first author.
Funding
This study was funded by UGC Research Fellowship, India.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jain, P., Tiwari, A. & Som, T. Fuzzy rough assisted missing value imputation and feature selection. Neural Comput & Applic 35, 2773–2793 (2023). https://doi.org/10.1007/s00521-022-07754-9
DOI: https://doi.org/10.1007/s00521-022-07754-9