A Wrapper Method for Feature Selection in Multiple Classes Datasets

Sánchez-Maroño, Noelia; Alonso-Betanzos, Amparo; Calvo-Estévez, Rosa M.

doi:10.1007/978-3-642-02478-8_57

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5517)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

Feature selection algorithms should remove irrelevant and redundant features while maintaining or even improving performance, thereby helping to improve the generalization of learning models. Feature selection methods can be broadly grouped into filters and wrappers. Most of the resulting models deal more or less adequately with binary problems but often underperform on multiclass tasks. In this article, a new wrapper method called IAFN-FS (Incremental ANOVA and Functional Networks-Feature Selection) is described in its version for multiclass problems. Two alternatives were tried for the multiclass approach: (a) treating the multiclass problem directly, and (b) dividing the original multiclass problem into several binary problems. To evaluate the performance of both approaches, a comparative study was carried out using several benchmark datasets, our two methods, and other wrappers based on classical algorithms such as C4.5 and Naive Bayes.
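The two multiclass alternatives named in the abstract can be illustrated with a minimal sketch. It is not IAFN-FS: the ANOVA decomposition and functional network are replaced here by a plain nearest-centroid classifier, the search is an ordinary greedy forward wrapper scored by leave-one-out accuracy, and the toy dataset and all function names are invented for illustration only.

```python
from statistics import mean

def fit_centroids(X, y):
    """Group rows by label and return {label: per-feature mean}."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {lab: [mean(col) for col in zip(*rows)] for lab, rows in groups.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of the stand-in classifier on the chosen columns."""
    proj = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(proj)):
        model = fit_centroids(proj[:i] + proj[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, proj[i]) == y[i]
    return hits / len(proj)

def forward_select(X, y):
    """Greedy forward wrapper: add the feature that most improves accuracy,
    and stop as soon as no remaining feature gives a strict improvement."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [j]), j) for j in remaining]
        score, j = max(scored, key=lambda t: t[0])  # ties go to the lowest index
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

def one_vs_rest_select(X, y):
    """Alternative (b): one binary problem per class, each with its own subset."""
    subsets = {}
    for label in sorted(set(y)):
        binary = [1 if v == label else 0 for v in y]
        subsets[label] = forward_select(X, binary)[0]
    return subsets

# Hypothetical toy data: feature 0 separates A from {B, C}, feature 1 separates
# C from {A, B}, and feature 2 carries no class information.
X = [
    [0.0, 0.1, 1.0], [0.2, 0.0, 2.0], [0.1, 0.2, 1.0], [0.0, 0.0, 2.0],  # class A
    [5.0, 0.1, 2.0], [5.2, 0.0, 1.0], [5.1, 0.2, 2.0], [5.0, 0.0, 1.0],  # class B
    [5.0, 5.1, 1.0], [5.2, 5.0, 2.0], [5.1, 5.2, 1.0], [5.0, 5.0, 2.0],  # class C
]
y = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

direct_subset, direct_accuracy = forward_select(X, y)  # alternative (a)
per_class_subsets = one_vs_rest_select(X, y)           # alternative (b)
print(direct_subset, direct_accuracy)
print(per_class_subsets)
```

In this sketch the direct approach returns a single feature subset for the whole problem, whereas the binary decomposition returns one subset per class, so a feature that matters for only one class can be dropped from the other subproblems. How the per-class results are combined into a final classifier is left out, since the abstract does not specify it.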

This work has been funded in part by Project TIN2006-02402 of the Ministerio de Educación y Ciencia, Spain (partially supported by the European Union ERDF).

Author information

Authors and Affiliations

Department of Computer Science, University of A Coruña, 15071, A Coruña, Spain
Noelia Sánchez-Maroño, Amparo Alonso-Betanzos & Rosa M. Calvo-Estévez

Editor information

Editors and Affiliations

Departamento de Ingeniería Electrónica, Universitat Politècnica de Catalunya (UPC), E.T.S.I. de Telecomunicación, Campus Norte, Edificio C4, C/ Jordi Girona, 1-3, E08034, Barcelona, Spain
Joan Cabestany
Grupo ISIS, Dpto. Tecnología Electrónica ETSI Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071, Málaga, Spain
Francisco Sandoval
Department of Computer Architecture and Computer Technology, University of Granada, Spain
Alberto Prieto
Department of Informatics, University of Salamanca, Salamanca, Spain
Juan M. Corchado

About this paper

Cite this paper

Sánchez-Maroño, N., Alonso-Betanzos, A., Calvo-Estévez, R.M. (2009). A Wrapper Method for Feature Selection in Multiple Classes Datasets. In: Cabestany, J., Sandoval, F., Prieto, A., Corchado, J.M. (eds) Bio-Inspired Systems: Computational and Ambient Intelligence. IWANN 2009. Lecture Notes in Computer Science, vol 5517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02478-8_57

DOI: https://doi.org/10.1007/978-3-642-02478-8_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02477-1
Online ISBN: 978-3-642-02478-8
eBook Packages: Computer Science, Computer Science (R0)
