Abstract
Given two point sets W, B ⊆ ℝ^n, it is an interesting and well-studied problem to design a linear decision tree that classifies them (that is, no leaf region contains points from both W and B) and is as simple as possible, either in the total number of nodes or in depth. We show that, unless ZPP = NP, the depth of such a classifier cannot be approximated within a factor smaller than 6/5, and the total number of nodes cannot be approximated within a factor smaller than n^{1/5}. Our proof relies on a simple connection between this problem and graph coloring, and uses recent nonapproximability results for graph coloring. We also study the problem of designing a classifier with a single inequality that involves as few variables as possible, and point out certain aspects of the difficulty of this problem.
Research partially supported by the National Science Foundation.
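To make the object of study concrete, here is a minimal sketch of a linear decision tree: each internal node tests a linear inequality a·x ≥ t, and a leaf is pure when only points from one of the two sets reach it. The names (`Node`, `classify`) and the example sets are illustrative assumptions, not taken from the paper; the sets chosen are not linearly separable, so a single inequality cannot classify them, but a depth-2 tree can.

```python
# Sketch of a linear decision tree classifying two point sets W and B.
# An internal node tests dot(a, x) >= t; a leaf carries a class label.
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    a: Optional[Point] = None       # normal vector of the splitting hyperplane
    t: float = 0.0                  # threshold: the test is dot(a, x) >= t
    left: Optional["Node"] = None   # subtree for points failing the test
    right: Optional["Node"] = None  # subtree for points passing the test
    label: Optional[str] = None     # 'W' or 'B' at a leaf

def dot(a: Point, x: Point) -> float:
    return sum(ai * xi for ai, xi in zip(a, x))

def classify(node: Node, x: Point) -> str:
    # Walk from the root, branching on each linear test, until a leaf.
    while node.label is None:
        node = node.right if dot(node.a, x) >= node.t else node.left
    return node.label

# W = {(0,0),(2,2)} and B = {(0,2),(2,0)} (an XOR-like configuration):
# no single hyperplane separates them, but two suffice in a tree.
tree = Node(a=(1.0, -1.0), t=1.0,            # test: x - y >= 1
            right=Node(label="B"),           # (2,0) lands here
            left=Node(a=(-1.0, 1.0), t=1.0,  # test: y - x >= 1
                      right=Node(label="B"),    # (0,2) lands here
                      left=Node(label="W")))    # (0,0) and (2,2) land here

W = [(0.0, 0.0), (2.0, 2.0)]
B = [(0.0, 2.0), (2.0, 0.0)]
assert all(classify(tree, p) == "W" for p in W)
assert all(classify(tree, p) == "B" for p in B)
```

The tree above has 3 internal nodes and depth 2; the paper's hardness results concern minimizing exactly these two measures over all correct trees.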
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grigni, M., Mirelli, V., Papadimitriou, C.H. (1996). On the difficulty of designing good classifiers. In: Cai, JY., Wong, C.K. (eds) Computing and Combinatorics. COCOON 1996. Lecture Notes in Computer Science, vol 1090. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61332-3_161
DOI: https://doi.org/10.1007/3-540-61332-3_161
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61332-9
Online ISBN: 978-3-540-68461-9
eBook Packages: Springer Book Archive