Applying Cost Sensitive Feature Selection in an Electric Database

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4994)

Abstract

Feature selection is a crucial activity when knowledge discovery is applied to large databases, as it reduces dimensionality and therefore the complexity of the problem. Its main objective is to eliminate attributes to obtain a computationally tractable problem without affecting solution quality. Several feature selection methods have been proposed, some of them tested only on small academic datasets. In this paper we evaluate different feature selection-ranking methods on a large real-world database from a Mexican electric energy client-invoice system. Most research on feature selection methods evaluates only accuracy and processing time; here we also report on cost-sensitive classification and the amount of discovered knowledge. Additionally, we stress the issue of the boundary that separates relevant from irrelevant features. Finally, based on the experiments performed, we propose a promising feature selection heuristic that takes cost-sensitive classification into account.
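
As a rough illustration of the two ideas the abstract combines, the following minimal Python sketch ranks discrete features by a score and then evaluates predictions by total misclassification cost instead of plain accuracy. It is not the paper's procedure: the choice of information gain as the ranking score, the function names, and the cost-matrix layout are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Entropy reduction obtained by splitting the labels on one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest information gain.

    Assumption: information gain stands in for whichever ranking
    metric is actually being evaluated.
    """
    n_features = len(rows[0])
    gains = [
        information_gain([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)


def misclassification_cost(y_true, y_pred, cost):
    """Sum the cost of each (true, predicted) pair.

    `cost` is a hypothetical dict keyed by (true_label, predicted_label).
    """
    return sum(cost[(t, p)] for t, p in zip(y_true, y_pred))
```

With a cost table in which a missed positive is more expensive than a false alarm, two classifiers with equal accuracy can receive very different total costs, which is why reporting cost alongside accuracy can change which feature subset looks best.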




Editor information

Aijun An, Stan Matwin, Zbigniew W. Raś, Dominik Ślęzak


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mejía-Lavalle, M. (2008). Applying Cost Sensitive Feature Selection in an Electric Database. In: An, A., Matwin, S., Raś, Z.W., Ślęzak, D. (eds) Foundations of Intelligent Systems. ISMIS 2008. Lecture Notes in Computer Science, vol 4994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68123-6_71


  • DOI: https://doi.org/10.1007/978-3-540-68123-6_71

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68122-9

  • Online ISBN: 978-3-540-68123-6

  • eBook Packages: Computer Science, Computer Science (R0)
