Abstract
Feature selection serves two purposes: reducing the total amount of available data (removing valueless features) and improving the overall behavior of a given induction algorithm (removing data that degrade its results). A method for selecting suitable features for an inductive algorithm is discussed. The main idea is to order features in descending order according to a measure of the new information each feature contributes to the previously selected set of valuable features. The measure is based on comparing the statistical distributions of individual features, including their mutual correlation. The mathematical theory behind the approach is described, and results of applying the method to real-life data are shown.
The work was supported by the Ministry of Education of the Czech Republic under project LN00B096.
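The reordering idea sketched in the abstract can be illustrated in code. The sketch below is not the paper's exact measure: as assumptions, it uses the two-sample Kolmogorov–Smirnov statistic as the per-feature relevance score (a distribution-comparison statistic, in the spirit of the Smirnov and Cramér–von Mises statistics), and absolute Pearson correlation with already-placed features as the redundancy penalty. The names `reorder_features` and the score formula are illustrative, not from the paper.

```python
# Illustrative sketch of feature reordering by estimated "new information".
# Assumptions (not from the paper): relevance = two-sample Kolmogorov-Smirnov
# statistic between class-conditional samples; redundancy = max absolute
# Pearson correlation with features already placed earlier in the order.
import numpy as np
from scipy.stats import ks_2samp


def reorder_features(X, y):
    """Return feature indices in descending order of estimated new information.

    X: (n_samples, n_features) array; y: binary class labels (0/1).
    """
    n_features = X.shape[1]
    # Relevance: KS distance between the two class-conditional distributions.
    relevance = np.array([
        ks_2samp(X[y == 0, j], X[y == 1, j]).statistic
        for j in range(n_features)
    ])
    # Pairwise absolute correlation between features.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    remaining = set(range(n_features))
    order = []
    while remaining:
        best, best_score = None, -np.inf
        for j in remaining:
            # Discount a feature's relevance by its redundancy with the
            # features already placed earlier in the ordering.
            redundancy = max((corr[j, k] for k in order), default=0.0)
            score = relevance[j] * (1.0 - redundancy)
            if score > best_score:
                best, best_score = j, score
        order.append(best)
        remaining.remove(best)
    return order
```

On synthetic data with one informative feature, one noise feature, and a near-copy of the informative feature, such a greedy ordering places the informative feature first and pushes its redundant copy behind the noise feature, since the copy contributes almost no new information once the original is selected.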
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Jirina, M., Jirina, M. (2005). Feature Selection by Reordering. In: Vojtáš, P., Bieliková, M., Charron-Bost, B., Sýkora, O. (eds) SOFSEM 2005: Theory and Practice of Computer Science. SOFSEM 2005. Lecture Notes in Computer Science, vol 3381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30577-4_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24302-1
Online ISBN: 978-3-540-30577-4