Abstract
This paper presents Newton trees, a redefinition of probability estimation trees (PETs) based on a stochastic view of decision trees that follows the principle of attraction, relating mass and distance through the inverse square law. The structure, application and graphical representation of Newton trees make their stochastically driven predictions intelligible to users, thus preserving comprehensibility, one of the most desirable features of decision trees. Unlike almost all existing decision tree learning methods, which use different kinds of partitions depending on the attribute datatype, the construction of prototypes and the derivation of probabilities from distances are identical for every datatype (nominal and numerical, but also structured). We present a way of graphically representing these stochastic probability estimation trees using a user-friendly gravitation simile. We include experiments showing that Newton trees outperform other PETs in both probability estimation and accuracy.
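To give a rough flavour of the attraction principle the abstract describes, the sketch below converts the distances from a query instance to per-class prototypes into class probabilities using an inverse-square weighting. This is an illustrative assumption, not the authors' algorithm: the function name, the mass/distance inputs, and the smoothing constant eps are our own, and the paper's actual prototype construction and probability derivation are defined in the full text.

```python
# Minimal sketch (our assumption, not the paper's definition): class
# probabilities from inverse-square attraction to class prototypes.
# masses[c]  : number of training examples behind class c's prototype
# distances[c]: distance from the query instance to that prototype
def attraction_probabilities(masses, distances, eps=1e-9):
    # Gravitation simile: attraction force scales as mass / distance^2.
    # eps avoids division by zero when the query sits on a prototype.
    forces = {c: masses[c] / (distances[c] ** 2 + eps) for c in masses}
    total = sum(forces.values())
    # Normalise forces so they sum to one, yielding a probability estimate.
    return {c: f / total for c, f in forces.items()}

# Example: equal masses, but class "b" is twice as far away as class "a".
print(attraction_probabilities({"a": 10, "b": 10}, {"a": 1.0, "b": 2.0}))
# -> roughly {'a': 0.8, 'b': 0.2}, since the force on "b" is 1/4 of "a"'s
```

Note that this weighting scheme applies unchanged regardless of how the distance is computed, which is what allows the same mechanism to cover nominal, numerical and structured datatypes once a distance to each prototype is available.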
This work has been partially supported by the EU (FEDER) and the Spanish MEC/MICINN, under grant TIN 2007-68093-C02 and the Spanish project “Agreement Technologies” (Consolider Ingenio CSD2007-00022).
© 2010 Springer-Verlag Berlin Heidelberg

Cite this paper
Martínez-Plumed, F., Estruch, V., Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.J. (2010). Newton Trees. In: Li, J. (ed.) AI 2010: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 6464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17432-2_18
DOI: https://doi.org/10.1007/978-3-642-17432-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17431-5
Online ISBN: 978-3-642-17432-2
eBook Packages: Computer Science