Learning Decision Trees with Flexible Constraints and Objectives Using Integer Optimization

Verwer, Sicco; Zhang, Yingqian

doi:10.1007/978-3-319-59776-8_8

Sicco Verwer¹⁵ &
Yingqian Zhang¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10335))

Included in the following conference series:

International Conference on AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems

2258 Accesses
18 Citations

Abstract

We encode the problem of learning the optimal decision tree of a given depth as an integer optimization problem. We show experimentally that our method (DTIP) can be used to learn good trees up to depth 5 from data sets of size up to 1000. In addition to being efficient, our new formulation allows for a lot of flexibility. Experiments show that we can use the trees learned from any existing decision tree algorithms as starting solutions and improve the trees using DTIP. Moreover, the proposed formulation allows us to easily create decision trees with different optimization objectives instead of accuracy and error, and constraints can be added explicitly during the tree construction phase. We show how this flexibility can be used to learn discrimination-aware classification trees, to improve learning from imbalanced data, and to learn trees that minimise false positive/negative errors.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Bennett, K.P., Blue, J.A.: Optimal decision trees. Technical report., R.P.I. Math Report No. 214, Rensselaer Polytechnic Institute (1996)
Google Scholar
Bertsimas, D., Shioda, R.: Classification and regression via integer optimization. Oper. Res. 55(2), 252–271 (2007)
Article MathSciNet MATH Google Scholar
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)
MATH Google Scholar
Bessiere, C., Hebrard, E., O’Sullivan, B.: Minimising decision tree size as combinatorial optimisation. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 173–187. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04244-7_16
Chapter Google Scholar
Bruynooghe, M., Blockeel, H., Bogaerts, B., De Cat, B., De Pooter, S., Jansen, J., Labarre, A., Ramon, J., Denecker, M., Verwer, S.: Predicate logic as a modeling language: modeling and solving some machine learning and data mining problems with idp3. Theor. Pract. Logic Program. 15(06), 783–817 (2015)
Article MathSciNet MATH Google Scholar
Carrizosa, E., Morales, D.R.: Supervised classification and mathematical optimization. Comput. Oper. Res. 40(1), 150–165 (2013)
Article MathSciNet MATH Google Scholar
Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J.: Modeling wine preferences by data mining from physicochemical properties. Decis. Support Syst. 47(4), 547–553 (2009)
Article Google Scholar
De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for data mining and machine learning. In: AAAI, pp. 1671–1675 (2010)
Google Scholar
Flach, P.: Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Cambridge University Press, Cambridge (2012)
Book MATH Google Scholar
Heule, M.J., Verwer, S.: Software model synthesis using satisfiability solvers. Empirical Softw. Eng. 18(4), 825–856 (2013)
Article Google Scholar
Hyafil, L., Rivest, R.L.: Constructing optimal binary decision trees is np-complete. Inf. Proc. Lett. 5(1), 15–17 (1976)
Article MathSciNet MATH Google Scholar
Lichman, M.: UCI machine learning repository. http://archive.ics.uci.edu/ml
Moro, S., Cortez, P., Rita, P.: A data-driven approach to predict the success of bank telemarketing. Decis. Support Syst. 62, 22–31 (2014)
Article Google Scholar
Norouzi, M., Collins, M.D., Johnson, M., Fleet, D.J., Kohli, P.: Efficient non-greedy optimization of decision trees. In: NIPS, pp. 1729–1737. MIT Press (2015)
Google Scholar
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
Google Scholar
Verwer, S., Zhang, Y., Ye, Q.C.: Auction optimization using regression trees and linear models as integer programs. Artif. Intell. 244, 368–395 (2017)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgments

This work is partially funded by Technologiestichting STW VENI project 13136 (MANTA).

Author information

Authors and Affiliations

Delft University of Technology, Delft, The Netherlands
Sicco Verwer
Eindhoven University of Technology, Eindhoven, The Netherlands
Yingqian Zhang

Authors

Sicco Verwer
View author publications
You can also search for this author in PubMed Google Scholar
Yingqian Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yingqian Zhang .

Editor information

Editors and Affiliations

University of Padua, Padua, Italy
Domenico Salvagnin
University of Bologna, Bologna, Italy
Michele Lombardi

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Verwer, S., Zhang, Y. (2017). Learning Decision Trees with Flexible Constraints and Objectives Using Integer Optimization. In: Salvagnin, D., Lombardi, M. (eds) Integration of AI and OR Techniques in Constraint Programming. CPAIOR 2017. Lecture Notes in Computer Science(), vol 10335. Springer, Cham. https://doi.org/10.1007/978-3-319-59776-8_8

Download citation

DOI: https://doi.org/10.1007/978-3-319-59776-8_8
Published: 31 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59775-1
Online ISBN: 978-3-319-59776-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics