An Experimental Study about Simple Decision Trees for Bagging Ensemble on Datasets with Classification Noise

  • Conference paper
Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5590)

Abstract

Decision trees are simple structures used in supervised classification learning. Their classification results can be notably improved with ensemble methods such as Bagging, Boosting, or Randomization, which are widely used in the literature; Bagging, in particular, outperforms Boosting and Randomization in situations with classification noise. In this paper, we present an experimental study of different simple decision tree methods as base classifiers for Bagging ensembles in supervised classification, showing that simple credal decision trees (based on imprecise probabilities and uncertainty measures) outperform classical decision tree methods in this type of procedure when applied to datasets with classification noise.

This work has been jointly supported by the Spanish Ministry of Education and Science under project TIN2007-67418-C03-03, by the European Regional Development Fund (FEDER), and by the FPU scholarship programme (AP2004-4678).
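The credal trees referenced in the abstract replace Shannon entropy, as a split criterion, with the maximum entropy over a credal set obtained from the Imprecise Dirichlet Model (IDM). The Python sketch below illustrates the two ingredients under stated assumptions: a water-filling computation of the IDM maximum-entropy distribution, and a plain Bagging loop with majority voting. It is a minimal illustration, not the authors' implementation; the function names (`upper_entropy_gain`, `bag`, `vote`) and the default IDM parameter `s = 1` are illustrative choices.

```python
import numpy as np
from collections import Counter


def max_entropy_distribution(counts, s=1.0):
    """Maximum-entropy distribution over the IDM credal set.

    For class counts n_c and IDM parameter s, the credal set is
    {p : p_c >= n_c / (N + s), sum_c p_c = 1}.  Its maximum-entropy
    element 'water-fills' the free mass s / (N + s) onto the smallest
    lower bounds until they reach a common level.
    """
    counts = np.asarray(counts, dtype=float)
    lower = counts / (counts.sum() + s)      # IDM lower probabilities
    budget = s / (counts.sum() + s)          # mass not fixed by the bounds
    p, order = lower.copy(), np.argsort(lower)
    for k in range(1, len(order) + 1):
        # Tentatively spread the whole budget over the k smallest entries.
        level = (lower[order[:k]].sum() + budget) / k
        if k == len(order) or level <= lower[order[k]]:
            p[order[:k]] = level             # feasible: stop water-filling
            break
    return p


def upper_entropy(counts, s=1.0):
    """Shannon entropy (in bits) of the maximum-entropy element."""
    p = max_entropy_distribution(counts, s)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def upper_entropy_gain(y, x, s=1.0):
    """Credal split score: drop in upper entropy when branching on x."""
    labels = sorted(set(y))

    def counts(ys):
        return [sum(1 for v in ys if v == c) for c in labels]

    children = 0.0
    for value in set(x):
        sub = [yi for yi, xi in zip(y, x) if xi == value]
        children += len(sub) / len(y) * upper_entropy(counts(sub), s)
    return upper_entropy(counts(y), s) - children


def bag(fit, X, y, n_models=100, seed=0):
    """Bagging: train one base model per bootstrap resample of (X, y).

    `fit(X, y)` must return a predictor, i.e. a callable mapping X -> labels.
    """
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))   # sample with replacement
        models.append(fit(X[idx], y[idx]))
    return models


def vote(models, X):
    """Ensemble prediction by majority vote over the base models."""
    preds = np.array([m(X) for m in models])
    return np.array([Counter(col).most_common(1)[0][0] for col in preds.T])
```

A credal tree would grow by selecting, at each node, the attribute with the largest `upper_entropy_gain` and stopping when no attribute yields a positive gain; Bagging then votes over such trees fitted to bootstrap samples. Note that with s = 0 the criterion in this sketch reduces to the classical information gain, which is how the credal and classical base trees compared in the paper relate.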

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abellán, J., Masegosa, A.R. (2009). An Experimental Study about Simple Decision Trees for Bagging Ensemble on Datasets with Classification Noise. In: Sossai, C., Chemello, G. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2009. Lecture Notes in Computer Science (LNAI), vol 5590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02906-6_39

  • DOI: https://doi.org/10.1007/978-3-642-02906-6_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02905-9

  • Online ISBN: 978-3-642-02906-6

  • eBook Packages: Computer Science (R0)
