Hyper-rectangular space partitioning trees: A practical approach

Computational Statistics

Summary

Computing a classification tree involves three basic choices: the type of splits considered in the growing process, the criterion optimized at each step, and the method used to obtain right-sized trees. Most implementations build ordinary binary trees, i.e. trees whose successive cuts are made by hyper-planes perpendicular to the axes. L. Devroye, L. Györfi and G. Lugosi (1996) define, and study the remarkable theoretical properties of, a binary tree classifier whose prominent feature is the particular type of split used in its construction: at a given node, the partition is made by a hyper-rectangle rather than a hyper-plane. We propose an approximate solution to the complex optimization problem involved, in order to gain insight into the practical advantages of these trees. We then compare the performance of our algorithm with that of leading algorithms for ordinary binary trees, namely CART and C4.5, as implemented in the S-Plus “tree” procedure and in SAS’s Enterprise Miner, respectively.
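To make the distinction between the two split types concrete, the following is a minimal sketch, not the authors' algorithm. It contrasts an axis-perpendicular hyper-plane cut with a hyper-rectangular cut on a small made-up data set and counts the misclassifications each produces when both sides of the split predict their majority class. The function names, the toy points, and the chosen threshold and box are all illustrative assumptions.

```python
from typing import List, Sequence

Point = Sequence[float]


def hyperplane_split(x: Point, axis: int, threshold: float) -> bool:
    """Ordinary axis-perpendicular cut: True if x lies on the 'left' side."""
    return x[axis] <= threshold


def hyperrectangle_split(x: Point, lower: Point, upper: Point) -> bool:
    """Hyper-rectangular cut: True if x lies inside the box [lower, upper]."""
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))


def split_error(inside: List[bool], labels: List[int]) -> int:
    """Misclassified count when each side predicts its own majority class."""
    errors = 0
    for side in (True, False):
        side_labels = [y for flag, y in zip(inside, labels) if flag == side]
        if side_labels:
            majority = max(set(side_labels), key=side_labels.count)
            errors += sum(1 for y in side_labels if y != majority)
    return errors


# Hypothetical toy data: class 1 forms a central cluster, class 0 surrounds it.
points = [(0.5, 0.5), (0.4, 0.6), (0.6, 0.4),
          (0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
labels = [1, 1, 1, 0, 0, 0, 0]

# One hyper-plane cut on the first coordinate (threshold chosen by hand).
plane = [hyperplane_split(p, axis=0, threshold=0.5) for p in points]
# One hyper-rectangle cut around the central cluster (box chosen by hand).
box = [hyperrectangle_split(p, (0.3, 0.3), (0.7, 0.7)) for p in points]

plane_errors = split_error(plane, labels)
box_errors = split_error(box, labels)
print(plane_errors, box_errors)
```

On this toy configuration a single box isolates the central class, whereas a single perpendicular cut cannot; an ordinary binary tree would need several successive cuts to achieve the same separation. Finding a good box in general is the optimization problem the paper approximates, which this sketch does not attempt.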

References

  • Breiman, L., Friedman, J., Olshen, R. & Stone, C. (1984), Classification and Regression Trees, Wadsworth International, Belmont, CA.

  • Buntine, W. & Caruana, R. (1992), Introduction to IND Version 2.1 and Recursive Partitioning, NASA Ames Research Center, Moffett Field, CA.

  • Clark, L. & Pregibon, D. (1993), Tree-based models, in J. Chambers & T. Hastie, eds, ‘Statistical Models in S’, Chapman & Hall, New York, NY, pp. 377–419.

  • Devroye, L., Györfi, L. & Lugosi, G. (1996), A probabilistic theory of pattern recognition, Springer-Verlag, New York, NY.

  • Esposito, F., Malerba, D. & Semeraro, G. (1997), ‘A comparative analysis of methods for pruning decision trees’, IEEE Transactions on Pattern Analysis and Machine Intelligence 19(5), 476–491.

  • Friedman, J. & Fisher, N. (1999), ‘Bump hunting in high-dimensional data’, Statistics and Computing 9(2), 123–143.

  • Michie, D., Spiegelhalter, D. J. & Taylor, C. C. (1994), Machine Learning, Neural and Statistical Classification, Ellis Horwood.

  • Muller, W. & Wysotzki, F. (1997), The decision tree algorithm CAL5 based on a statistical approach to its splitting algorithm, in ‘Machine Learning and Statistics: The Interface’, John Wiley & Sons, New York, NY, pp. 45–65.

  • Murphy, P. M. & Aha, D. W. (1996), UCI Repository of machine learning databases, Department of Information and Computer Science, University of California, Irvine, CA.

  • Quinlan, J. R. (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA.

  • Shih, Y.-S., Lim, T.-S. & Loh, W.-Y. (2000), ‘A comparison of prediction accuracy, complexity and training time of thirty-three old and new classification algorithms’, Machine Learning 40, 203–228.

  • Venables, W. & Ripley, B. (1994), Modern Applied Statistics with S-Plus, Springer-Verlag, New York, NY.

Additional information

Research support from the “Projet d’Actions de Recherche Concertées” (No. 98/03-217) and from the “Interuniversity Attraction Pole”, Phase V (No. P5/24), of the Belgian Government is also acknowledged.

About this article

Cite this article

De Macq, I., Simar, L. Hyper-rectangular space partitioning trees: A practical approach. Computational Statistics 20, 119–135 (2005). https://doi.org/10.1007/BF02736126

  • DOI: https://doi.org/10.1007/BF02736126
