Abstract
Feature selection algorithms should remove irrelevant and redundant features while maintaining or even improving performance, thereby helping learning models to generalize better. Feature selection methods fall mainly into two groups: filters and wrappers. Most of the models built cope more or less adequately with binary problems, but often underperform on multiclass tasks. This article describes the multiclass version of a new wrapper method called IAFN-FS (Incremental ANOVA and Functional Networks-Feature Selection). Two alternatives for handling the multiclass setting were tried: (a) treating the multiclass problem directly, and (b) dividing the original multiclass problem into several binary problems. To evaluate the performance of both approaches, a comparative study was carried out on several benchmark datasets, comparing our two methods with other wrappers based on classical algorithms such as C4.5 and Naive Bayes.
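The two multiclass alternatives named in the abstract can be illustrated with a generic wrapper. The following is only a minimal sketch and is not IAFN-FS: it swaps in a leave-one-out 1-nearest-neighbour classifier as the induction algorithm and greedy forward search as the search strategy, where the paper uses functional networks and ANOVA decomposition. The function names, the toy data, and the choice to merge the per-class subsets by union in alternative (b) are all assumptions made for illustration.

```python
from typing import List, Sequence


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence, features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices (stand-in learner)."""
    if not features:
        return 0.0
    correct = 0
    for i, row in enumerate(X):
        # Nearest other sample under squared Euclidean distance.
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def forward_select(X: Sequence[Sequence[float]], y: Sequence) -> List[int]:
    """Greedy forward wrapper: repeatedly add the feature that most
    improves the learner's score; stop when no feature improves it."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    best = 0.0
    while remaining:
        cand = max(remaining, key=lambda f: loo_1nn_accuracy(X, y, selected + [f]))
        cand_score = loo_1nn_accuracy(X, y, selected + [cand])
        if cand_score <= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected


def one_vs_rest_select(X: Sequence[Sequence[float]], y: Sequence) -> List[int]:
    """Alternative (b): one binary problem per class.
    Merging the per-class subsets by union is an assumption of this sketch."""
    chosen = set()
    for cls in sorted(set(y)):
        binary = [1 if label == cls else 0 for label in y]
        chosen.update(forward_select(X, binary))
    return sorted(chosen)


# Made-up toy data: feature 0 separates the three classes, feature 1 does not.
X = [[0.0, 1.0], [0.2, 5.0], [5.0, 5.1], [5.2, 0.9], [10.0, 1.1], [10.2, 4.9]]
y = ["a", "a", "b", "b", "c", "c"]

direct_subset = forward_select(X, y)       # alternative (a): multiclass directly
binary_subset = one_vs_rest_select(X, y)   # alternative (b): several binary problems
print(direct_subset, binary_subset)
```

On this toy data both alternatives keep only feature 0. On real datasets the two subsets can differ, which is the kind of difference a comparative study such as the one described above would measure.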
This work has been funded in part by Project TIN2006-02402 of the Ministerio de Educación y Ciencia, Spain (partially supported by the European Union ERDF).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sánchez-Maroño, N., Alonso-Betanzos, A., Calvo-Estévez, R.M. (2009). A Wrapper Method for Feature Selection in Multiple Classes Datasets. In: Cabestany, J., Sandoval, F., Prieto, A., Corchado, J.M. (eds) Bio-Inspired Systems: Computational and Ambient Intelligence. IWANN 2009. Lecture Notes in Computer Science, vol 5517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02478-8_57
DOI: https://doi.org/10.1007/978-3-642-02478-8_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02477-1
Online ISBN: 978-3-642-02478-8
eBook Packages: Computer Science, Computer Science (R0)