On the difficulty of designing good classifiers

  • Session 8
  • Conference paper
Computing and Combinatorics (COCOON 1996)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1090)


Abstract

It is a very interesting and well-studied problem, given two point sets \(W, B \subseteq \mathbb{R}^n\), to design a linear decision tree that classifies them (that is, no leaf subdivision contains points from both B and W) and is as simple as possible, either in terms of the total number of nodes or in terms of its depth. We show that, unless ZPP = NP, the depth of a classifier cannot be approximated within a factor smaller than 6/5, and that the total number of nodes cannot be approximated within a factor smaller than \(n^{1/5}\). Our proof relies on a simple connection between this problem and graph coloring, and uses recent nonapproximability results for graph coloring. We also study the problem of designing a classifier with a single inequality that involves as few variables as possible, and point out certain aspects of the difficulty of this problem.
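
To make the abstract's objects concrete, the sketch below (Python; all names are illustrative and not from the paper) builds a linear decision tree over points in the plane, checks the abstract's classification condition that no leaf cell contains points from both W and B, and computes the two simplicity measures in question: total node count and depth.

# Toy linear decision tree over points in R^2 (illustrative sketch only).
# Each internal node tests a linear inequality a.x <= b; each leaf is one
# cell of the subdivision of R^2 induced by the tests along its root path.

from collections import defaultdict

class Node:
    def __init__(self, a=None, b=None, left=None, right=None):
        self.a, self.b = a, b                 # test: dot(a, x) <= b
        self.left, self.right = left, right   # go left if the test holds

    def is_leaf(self):
        return self.a is None

def leaf_of(tree, x):
    """Follow the tests from the root; return the leaf whose cell contains x."""
    node = tree
    while not node.is_leaf():
        holds = sum(ai * xi for ai, xi in zip(node.a, x)) <= node.b
        node = node.left if holds else node.right
    return node

def classifies(tree, W, B):
    """The abstract's condition: no leaf cell gets points from both W and B."""
    labels_at = defaultdict(set)
    for label, pts in (("W", W), ("B", B)):
        for x in pts:
            labels_at[id(leaf_of(tree, x))].add(label)
    return all(len(labels) == 1 for labels in labels_at.values())

def size_and_depth(tree):
    """The two simplicity measures: total number of nodes, and tree depth."""
    if tree.is_leaf():
        return 1, 0
    ls, ld = size_and_depth(tree.left)
    rs, rd = size_and_depth(tree.right)
    return ls + rs + 1, 1 + max(ld, rd)

# Two small point sets separable by a single hyperplane: the test x + y <= 2.
W = [(0.0, 0.0), (1.0, 0.5)]
B = [(3.0, 3.0), (4.0, 2.5)]
tree = Node(a=(1.0, 1.0), b=2.0, left=Node(), right=Node())
print(classifies(tree, W, B))   # True: W falls in the left leaf, B in the right
print(size_and_depth(tree))     # (3, 1)

For these sample sets a single test already classifies the points, so the minimum is 3 nodes and depth 1; the paper's results say that these minima are hard even to approximate as n grows.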

Research partially supported by the National Science Foundation.
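
The last problem mentioned in the abstract, a single separating inequality involving as few variables as possible, can be probed with the brute-force sketch below. It enumerates variable supports in order of increasing size and tests each one with a linear-programming feasibility check; the exponential enumeration, the unit separation margin, and the scipy dependency are assumptions of this illustration, not the paper's method.

# Brute-force sketch: find a halfspace a.x <= b separating W from B whose
# normal vector a uses as few coordinates as possible. Illustrative only;
# the subset search is exponential in the dimension n. Requires scipy.

from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def separable_on(S, W, B):
    """LP feasibility: is there (a, b) with support(a) inside S such that
    a.w <= b - 1 for all w in W and a.v >= b + 1 for all v in B?"""
    k = len(S)
    # Decision variables: a_1..a_k, then b.
    A_ub, b_ub = [], []
    for w in W:
        A_ub.append([w[j] for j in S] + [-1.0]); b_ub.append(-1.0)
    for v in B:
        A_ub.append([-v[j] for j in S] + [1.0]); b_ub.append(-1.0)
    res = linprog(np.zeros(k + 1), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] * (k + 1))
    return res.x if res.success else None

def fewest_variable_separator(W, B):
    n = len(W[0])
    for k in range(1, n + 1):            # try supports of increasing size
        for S in combinations(range(n), k):
            sol = separable_on(S, W, B)
            if sol is not None:
                return S, sol            # support set and (a_S, b) values
    return None                          # not separable by one inequality

# W and B differ only in coordinate 0, so a single variable suffices.
W = [(0.0, 5.0, -2.0), (1.0, -3.0, 7.0)]
B = [(4.0, 5.0, -2.0), (5.0, -3.0, 7.0)]
print(fewest_variable_separator(W, B))   # support (0,) with some (a, b)

The unit margin on both sides only rules out the degenerate solution a = 0; any strictly separating inequality can be rescaled to meet it.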

Author information

Correspondence to Michelangelo Grigni or Christos H. Papadimitriou.

Editor information

Jin-Yi Cai, Chak Kuen Wong

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grigni, M., Mirelli, V., Papadimitriou, C.H. (1996). On the difficulty of designing good classifiers. In: Cai, J.Y., Wong, C.K. (eds) Computing and Combinatorics. COCOON 1996. Lecture Notes in Computer Science, vol 1090. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61332-3_161


  • DOI: https://doi.org/10.1007/3-540-61332-3_161

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61332-9

  • Online ISBN: 978-3-540-68461-9

  • eBook Packages: Springer Book Archive
