Abstract
We introduce a new algorithm that builds an optimal dyadic decision tree (ODT). The method combines guaranteed performance in the learning-theoretical sense with optimal search from the algorithmic point of view. Furthermore, it inherits the explanatory power of tree approaches while improving on classical approaches such as CART/C4.5, as shown by experiments on artificial and benchmark data.
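As a rough illustration of what exact search over dyadic trees means, the sketch below minimises a penalized empirical misclassification error by memoised recursion over dyadic cells of [0, 1)^d, where each cell may be split at its midpoint along one coordinate. This is only a minimal sketch under assumed choices (a leaf-count penalty, a depth limit, the hypothetical function name optimal_dyadic_tree), not the authors' algorithm or implementation as developed in the paper.

import numpy as np
from functools import lru_cache

def optimal_dyadic_tree(X, y, max_depth=3, penalty=0.01):
    """Illustrative exact search over dyadic classification trees.

    Assumes features are scaled to [0, 1) and labels are non-negative
    integers. Each cell of [0, 1)^d may be split at its midpoint along
    one axis; the tree minimising
        (misclassification rate) + penalty * (#leaves)
    is found by memoised recursion over cells (a sketch only).
    """
    n, d = X.shape

    @lru_cache(maxsize=None)
    def best(lo, hi, depth):
        # Points falling in the current hyperrectangular cell [lo, hi).
        inside = np.all((X >= np.array(lo)) & (X < np.array(hi)), axis=1)
        labels = y[inside]
        if labels.size == 0:
            leaf_cost = penalty  # empty leaf: no errors, one leaf
        else:
            majority = np.bincount(labels).argmax()
            leaf_cost = np.sum(labels != majority) / n + penalty
        best_cost, best_tree = leaf_cost, ("leaf",)
        if depth < max_depth:
            for j in range(d):  # try a dyadic (midpoint) split on axis j
                mid = 0.5 * (lo[j] + hi[j])
                left_hi = tuple(mid if k == j else hi[k] for k in range(d))
                right_lo = tuple(mid if k == j else lo[k] for k in range(d))
                c_l, t_l = best(lo, left_hi, depth + 1)
                c_r, t_r = best(right_lo, hi, depth + 1)
                if c_l + c_r < best_cost:
                    best_cost, best_tree = c_l + c_r, ("split", j, t_l, t_r)
        return best_cost, best_tree

    return best((0.0,) * d, (1.0,) * d, 0)

# Toy usage on a dyadically separable problem.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = (X[:, 0] < 0.5).astype(int)
cost, tree = optimal_dyadic_tree(X, y)

The memoisation reflects the point that makes exact search tractable in principle: a given dyadic cell can be reached through different orders of splits, but its optimal subtree needs to be computed only once.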
References
Adelson-Velskii, G. M., & Landis, E. M. (1962). An algorithm for the organization of information. Soviet Math. Doklady, 3, 1259–1263.
Barron, A., Birgé, L., & Massart, P. (1999). Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113, 301–413.
Barron, A., & Sheu, C. (1991). Approximation of density functions by sequences of exponential families. Annals of Statistics, 19, 1347–1369.
Bartlett, P., Bousquet, O., & Mendelson, S. (2005). Local Rademacher complexities. Annals of Statistics, 33(4), 1497–1537.
Blanchard, G. (2004). Different paradigms for choosing sequential reweighting algorithms. Neural Computation, 16, 811–836.
Blanchard, G., Bousquet, O., & Massart, P. (2004). Statistical performance of support vector machines. Submitted manuscript.
Blanchard, G., Schäfer, C., & Rozenholc, Y. (2004). Oracle bounds and exact algorithm for dyadic classification trees. In J. Shawe-Taylor & Y. Singer (Eds.), Proceedings of the 17th Conference on Learning Theory (COLT 2004), number 3210 in Lecture Notes in Artificial Intelligence (pp. 378–392). Springer.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. Wadsworth.
Castellan, G. (2000). Histograms selection with an Akaike type criterion. C. R. Acad. Sci., Paris, Sér. I, Math., 330(8), 729–732.
Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. Wiley series in telecommunications. J. Wiley.
Devroye, L., Györfi, L., & Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Number 31 in Applications of Mathematics. New York: Springer.
Donoho, D. (1997). CART and best-ortho-basis: A connection. Annals of Statistics, 25, 1870–1911.
Gey, S., & Nédélec, E. (2005). Model selection for CART regression trees. IEEE Transactions on Information Theory, 51(2), 658–670.
Györfi, L., Kohler, M., & Krzyzak, A. (2002). A distribution-free theory of nonparametric regression. Springer series in statistics. Springer.
Klemelä, J. (2003). Multivariate histograms with data-dependent partitions. Technical report, Institut für angewandte Mathematik, Universität Heidelberg.
Massart, P. (2000). Some applications of concentration inequalities in statistics. Ann. Fac. Sci. Toulouse Math., 9(2), 245–303.
Mika, S., Rätsch, G., Weston, J., Schölkopf, B., & Müller, K.-R. (1999). Fisher discriminant analysis with kernels. In Y.-H. Hu, J. Larsen, E. Wilson & S. Douglas (Eds.), Neural networks for signal processing IX (pp. 41–48). IEEE.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann.
Rätsch, G., Onoda, T., & Müller, K.-R. (2001). Soft margins for AdaBoost. Machine Learning, 42(3), 287–320. Also NeuroCOLT Technical Report NC-TR-1998-021.
Scott, C., & Nowak, R. (2004). Near-minimax optimal classification with dyadic classification trees. In S. Thrun, L. Saul & B. Schölkopf (Eds.), Advances in neural information processing systems 16. Cambridge, MA: MIT Press.
Scott, C., & Nowak, R. (2006). Minimax optimal classification with dyadic decision trees. IEEE Transactions on Information Theory, 52(4), 1335–1353.
Additional information
Editors: Olivier Bousquet and Andre Elisseeff
About this article
Cite this article
Blanchard, G., Schäfer, C., Rozenholc, Y. et al. Optimal dyadic decision trees. Mach Learn 66, 209–241 (2007). https://doi.org/10.1007/s10994-007-0717-6