Skip to main content

Learning Decision Trees with Flexible Constraints and Objectives Using Integer Optimization

  • Conference paper
  • First Online:
Integration of AI and OR Techniques in Constraint Programming (CPAIOR 2017)

Abstract

We encode the problem of learning the optimal decision tree of a given depth as an integer optimization problem. We show experimentally that our method (DTIP) can be used to learn good trees up to depth 5 from data sets of size up to 1000. In addition to being efficient, our new formulation allows for a lot of flexibility. Experiments show that we can use the trees learned from any existing decision tree algorithms as starting solutions and improve the trees using DTIP. Moreover, the proposed formulation allows us to easily create decision trees with different optimization objectives instead of accuracy and error, and constraints can be added explicitly during the tree construction phase. We show how this flexibility can be used to learn discrimination-aware classification trees, to improve learning from imbalanced data, and to learn trees that minimise false positive/negative errors.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Bennett, K.P., Blue, J.A.: Optimal decision trees. Technical report., R.P.I. Math Report No. 214, Rensselaer Polytechnic Institute (1996)

    Google Scholar 

  2. Bertsimas, D., Shioda, R.: Classification and regression via integer optimization. Oper. Res. 55(2), 252–271 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  3. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)

    MATH  Google Scholar 

  4. Bessiere, C., Hebrard, E., O’Sullivan, B.: Minimising decision tree size as combinatorial optimisation. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 173–187. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04244-7_16

    Chapter  Google Scholar 

  5. Bruynooghe, M., Blockeel, H., Bogaerts, B., De Cat, B., De Pooter, S., Jansen, J., Labarre, A., Ramon, J., Denecker, M., Verwer, S.: Predicate logic as a modeling language: modeling and solving some machine learning and data mining problems with idp3. Theor. Pract. Logic Program. 15(06), 783–817 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  6. Carrizosa, E., Morales, D.R.: Supervised classification and mathematical optimization. Comput. Oper. Res. 40(1), 150–165 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  7. Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J.: Modeling wine preferences by data mining from physicochemical properties. Decis. Support Syst. 47(4), 547–553 (2009)

    Article  Google Scholar 

  8. De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for data mining and machine learning. In: AAAI, pp. 1671–1675 (2010)

    Google Scholar 

  9. Flach, P.: Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press, Cambridge (2012)

    Book  MATH  Google Scholar 

  10. Heule, M.J., Verwer, S.: Software model synthesis using satisfiability solvers. Empirical Softw. Eng. 18(4), 825–856 (2013)

    Article  Google Scholar 

  11. Hyafil, L., Rivest, R.L.: Constructing optimal binary decision trees is np-complete. Inf. Proc. Lett. 5(1), 15–17 (1976)

    Article  MathSciNet  MATH  Google Scholar 

  12. Lichman, M.: UCI machine learning repository. http://archive.ics.uci.edu/ml

  13. Moro, S., Cortez, P., Rita, P.: A data-driven approach to predict the success of bank telemarketing. Decis. Support Syst. 62, 22–31 (2014)

    Article  Google Scholar 

  14. Norouzi, M., Collins, M.D., Johnson, M., Fleet, D.J., Kohli, P.: Efficient non-greedy optimization of decision trees. In: NIPS, pp. 1729–1737. MIT Press (2015)

    Google Scholar 

  15. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)

    Google Scholar 

  16. Verwer, S., Zhang, Y., Ye, Q.C.: Auction optimization using regression trees and linear models as integer programs. Artif. Intell. 244, 368–395 (2017)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgments

This work is partially funded by Technologiestichting STW VENI project 13136 (MANTA).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yingqian Zhang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Verwer, S., Zhang, Y. (2017). Learning Decision Trees with Flexible Constraints and Objectives Using Integer Optimization. In: Salvagnin, D., Lombardi, M. (eds) Integration of AI and OR Techniques in Constraint Programming. CPAIOR 2017. Lecture Notes in Computer Science(), vol 10335. Springer, Cham. https://doi.org/10.1007/978-3-319-59776-8_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-59776-8_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59775-1

  • Online ISBN: 978-3-319-59776-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics