Abstract
Data imputation addresses the problem of missing values, which is common in present-day applications. Many techniques have been proposed to solve this problem, ranging from statistical methods such as Mean/Mode to machine learning models. In this paper, an approach based on the Co-active Neuro-Fuzzy Inference System, named CANFIS-ART, is proposed to automate the data imputation procedure. The model combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic, using the Fuzzy-ART algorithm. The performance of the CANFIS-ART model is compared with that of other state-of-the-art imputation techniques, such as Multilayer Perceptron and Hot-Deck, among others, using a total of eighteen databases subjected to a perturbation procedure based on the random generation of non-monotone missing-value patterns. The data sets cover a wide range of fields, variable types and sizes. The databases imputed by these models are compared using a set of three classifiers, and a statistical analysis of the results based on the Wilcoxon signed-rank test is included. Experiments show that the CANFIS-ART approach not only outperforms these state-of-the-art techniques but also demonstrates a higher level of generalization capability, increasing the quality of the data contained in databases with missing values.
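The perturbation procedure and the Mean/Mode baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `perturb` and `impute_mean_mode`, the `rate` and `seed` parameters, and the choice of drawing missing cells uniformly at random are assumptions made here for illustration, since the abstract does not give those details.

```python
import random
from collections import Counter
from statistics import mean


def perturb(rows, rate, seed=0):
    """Return a copy of `rows` with round(rate * n_cells) cells set to None.

    Assumption: cells are drawn uniformly at random over the whole table,
    so missing values are scattered across rows and columns instead of
    following a nested (monotone) pattern.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    chosen = rng.sample(cells, round(rate * len(cells)))
    out = [list(r) for r in rows]
    for i, j in chosen:
        out[i][j] = None
    return out


def impute_mean_mode(rows):
    """Fill None with the column mean (numeric) or column mode (categorical)."""
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            continue  # nothing to learn from a fully missing column
        if all(isinstance(v, (int, float)) for v in observed):
            fill = mean(observed)
        else:
            fill = Counter(observed).most_common(1)[0][0]
        for r in out:
            if r[j] is None:
                r[j] = fill
    return out


if __name__ == "__main__":
    # Hypothetical toy table: one numeric and one categorical variable.
    table = [[1.0, "a"], [2.0, "a"], [3.0, "b"], [6.0, "a"]]
    damaged = perturb(table, rate=0.25, seed=1)
    print(damaged)
    print(impute_mean_mode(damaged))
```

A model-based imputer would replace `impute_mean_mode` while keeping the same perturb-then-impute loop, which is what makes a comparison across techniques possible on identical damaged data.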









Change history
29 October 2021
A Correction to this paper has been published: https://doi.org/10.1007/s00521-021-06623-1
Ethics declarations
Conflict of interest
The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
Silva-Ramírez, EL., Cabrera-Sánchez, JF. Co-active neuro-fuzzy inference system model as single imputation approach for non-monotone pattern of missing data. Neural Comput & Applic 33, 8981–9004 (2021). https://doi.org/10.1007/s00521-020-05661-5
DOI: https://doi.org/10.1007/s00521-020-05661-5