Abstract
Induction of decision trees is one of the most successful approaches to supervised machine learning. Branching programs generalize decision trees and, by the boosting analysis, are exponentially more efficiently learnable than decision trees. However, this advantage has not materialized in experiments. Decision trees are easy to simplify using pruning, and reduced error pruning is one of the simplest decision tree pruning algorithms; for branching programs, no pruning algorithms are known. In this paper we prove that reduced error pruning of branching programs is infeasible: finding the optimal pruning of a branching program with respect to a set of pruning examples that is separate from the set of training examples is NP-complete. Because of this intractability result, we have to consider approximating reduced error pruning. Unfortunately, it turns out that even finding an approximate solution of arbitrary accuracy is computationally infeasible. In particular, reduced error pruning of branching programs is APX-hard. Our experiments show that, despite the negative theoretical results, heuristic pruning of branching programs can reduce their size without significantly altering the accuracy.
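To make the contrast concrete, the following is an illustrative sketch (not taken from the paper) of reduced error pruning for binary decision trees, the tractable case that the paper's hardness results distinguish from branching programs. For a tree, a single bottom-up pass over the pruning set finds an optimal pruning; the `Node` class, example encoding, and function names here are assumptions for the sketch.

```python
# Hypothetical sketch of reduced error pruning on a binary decision tree.
# Examples are (x, y) pairs, where x is a tuple of 0/1 feature values and
# y is a 0/1 class label; the pruning set is disjoint from the training set.

class Node:
    def __init__(self, feature=None, left=None, right=None, label=None):
        self.feature = feature  # feature index tested at an internal node
        self.left = left
        self.right = right
        self.label = label      # class predicted at a leaf (None if internal)

    def is_leaf(self):
        return self.label is not None

def predict(node, x):
    """Route an example down the tree to a leaf and return its label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] == 0 else node.right
    return node.label

def errors(node, examples):
    """Number of pruning examples the (sub)tree misclassifies."""
    return sum(predict(node, x) != y for x, y in examples)

def majority_label(examples, default=0):
    """Majority class of the examples reaching a node (ties and empty -> default)."""
    if not examples:
        return default
    ones = sum(y for _, y in examples)
    return 1 if 2 * ones > len(examples) else default

def reduced_error_prune(node, pruning_set):
    """Bottom-up pass: replace a subtree by a majority-class leaf whenever
    that does not increase error on the pruning examples reaching it."""
    if node.is_leaf():
        return node
    left_set = [(x, y) for x, y in pruning_set if x[node.feature] == 0]
    right_set = [(x, y) for x, y in pruning_set if x[node.feature] == 1]
    node.left = reduced_error_prune(node.left, left_set)
    node.right = reduced_error_prune(node.right, right_set)
    leaf = Node(label=majority_label(pruning_set))
    if errors(leaf, pruning_set) <= errors(node, pruning_set):
        return leaf
    return node
```

The greedy pass is optimal for trees because the examples reaching disjoint subtrees are disjoint, so each subtree can be pruned independently. In a branching program, nodes may have several parents and paths re-merge, so local replacement decisions interact; this is exactly the structure the paper exploits to show that optimal reduced error pruning becomes NP-complete and APX-hard.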
Elomaa, T., Kääriäinen, M. The Difficulty of Reduced Error Pruning of Leveled Branching Programs. Annals of Mathematics and Artificial Intelligence 41, 111–124 (2004). https://doi.org/10.1023/B:AMAI.0000018579.44321.6a