Split Criterions for Variable Selection Using Decision Trees

  • Conference paper
Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4724)

Abstract

In the field of attribute mining, several feature selection methods have recently appeared, indicating that sets of decision trees learnt from a data set can be a useful tool for selecting variables that are relevant and informative with respect to a main class variable. With this aim, in this study we claim that a new split criterion for building decision trees outperforms other classic split criteria for variable selection purposes. We present an experimental study on a wide and varied set of databases, using a single decision tree with each split criterion to select variables for the Naive Bayes classifier.
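As a point of reference for the classic criteria the paper compares against, the information-gain split criterion (Quinlan's ID3) can be sketched as follows. This is an illustrative, minimal implementation on toy data, not the new criterion the paper proposes; the data and attribute names are invented for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Information gain of splitting on column `attr` (the ID3 criterion):
    class entropy minus the weighted entropy of each attribute-value subset."""
    n = len(labels)
    remainder = 0.0
    for v in set(r[attr] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy data: each row is (outlook, windy); the class is whether to play.
rows = [("sunny", True), ("sunny", False), ("rain", True), ("rain", False)]
labels = ["no", "yes", "no", "yes"]

# Rank attributes by gain; here `windy` (index 1) perfectly predicts
# the class, while `outlook` (index 0) carries no information.
gains = [info_gain(rows, labels, a) for a in range(2)]
print(gains)  # → [0.0, 1.0]
```

A tree built greedily on such a score visits only informative attributes, which is what makes the set of attributes appearing in the tree a natural variable-selection filter for a downstream classifier such as Naive Bayes.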

This work has been supported by the Spanish Ministry of Science and Technology under projects TIN2005-02516 and TIN2004-06204-C03-02, and by the FPU scholarship programme (AP2004-4678).




Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abellán, J., Masegosa, A.R. (2007). Split Criterions for Variable Selection Using Decision Trees. In: Mellouli, K. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2007. Lecture Notes in Computer Science, vol. 4724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75256-1_44

  • DOI: https://doi.org/10.1007/978-3-540-75256-1_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75255-4

  • Online ISBN: 978-3-540-75256-1

  • eBook Packages: Computer Science (R0)
