Abstract
In this paper, we present an extension of the k-nearest neighbors method that can perform imputation/classification on datasets containing low-quality data. The method weights neighbors according to their imperfection and the distance between classes. It thus lets the user state explicitly both the average degree of imperfection that is acceptable among the neighbors used for imputation/classification and the permitted average distance between the neighbors' classes and the class of the example being imputed/classified. We test the proposed method in several experiments on real-world and synthetic datasets with low-quality data.
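The weighting idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the imperfection measure, the weighting formula, or the treatment of class distance. The sketch therefore assumes each training example carries a precomputed imperfection score in [0, 1], assumes a vote weight of (1 - imperfection) / (1 + distance), and omits the class-distance term. The function name `weighted_knn` and the parameter `max_avg_imperfection` are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn(train, query, k=3, max_avg_imperfection=0.5):
    """Classify `query` with an imperfection-weighted k-NN vote.

    `train` is a list of (features, label, imperfection) tuples, where
    imperfection is an assumed, precomputed score in [0, 1].
    Returns the winning label, or None if the k nearest neighbors are,
    on average, more imperfect than the accepted threshold.
    """
    # Keep the k training examples closest to the query (Euclidean distance).
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    if not neighbors:
        return None

    # Reject the neighborhood when its average imperfection is too high.
    avg_imperfection = sum(imp for _, _, imp in neighbors) / len(neighbors)
    if avg_imperfection > max_avg_imperfection:
        return None

    # Assumed weighting: cleaner and closer neighbors vote more strongly.
    votes = defaultdict(float)
    for features, label, imp in neighbors:
        votes[label] += (1.0 - imp) / (1.0 + math.dist(features, query))
    return max(votes, key=votes.get)


# Toy data: two clean examples of class "a", one clean and one noisy "b".
train = [
    ((0.0, 0.0), "a", 0.1),
    ((0.1, 0.0), "a", 0.2),
    ((1.0, 1.0), "b", 0.0),
    ((0.9, 1.0), "b", 0.9),
]
print(weighted_knn(train, (0.05, 0.0), k=3))
print(weighted_knn(train, (0.05, 0.0), k=3, max_avg_imperfection=0.3))
```

In this toy setting the threshold plays the role of the accepted average degree of imperfection: lowering it makes the method abstain rather than decide from a neighborhood of poor-quality data.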
Acknowledgements
Supported by the projects TIN2011-27696-C02-02, TIN2014-52099-R and TIN2014-56381-REDT (“Red de Lógica Difusa y Soft Computing (LODISCO)”) of the Ministry of Economy and Competitiveness of Spain.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Cadenas, J.M., Garrido, M.C., Martínez, R. (2015). Measuring Data Imperfection in a Neighborhood Based Method. In: Puerta, J., et al. Advances in Artificial Intelligence. CAEPIA 2015. Lecture Notes in Computer Science, vol 9422. Springer, Cham. https://doi.org/10.1007/978-3-319-24598-0_19
DOI: https://doi.org/10.1007/978-3-319-24598-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24597-3
Online ISBN: 978-3-319-24598-0
eBook Packages: Computer Science, Computer Science (R0)