Abstract
Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and non-convex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction problems. Tree search algorithms come with a variety of tunable parameters that are notoriously challenging to tune by hand. A growing body of research has demonstrated the power of using a data-driven approach to automatically optimize the parameters of tree search algorithms. These techniques use a training set of integer programs sampled from an application-specific instance distribution to find a parameter setting that has strong average performance over the training set. However, with too few samples, a parameter setting may have strong average performance on the training set but poor expected performance on future integer programs from the same application. Our main contribution is to provide the first sample complexity guarantees for tree search parameter tuning. These guarantees bound the number of samples sufficient to ensure that the average performance of tree search over the samples nearly matches its future expected performance on the unknown instance distribution. In particular, the parameters we analyze weight scoring rules used for variable selection. Proving these guarantees is challenging because tree size is a volatile function of these parameters: we prove that, for any discretization (uniform or not) of the parameter space, there exists a distribution over integer programs such that every parameter setting in the discretization results in a tree with exponential expected size, yet there exist parameter settings between the discretized points that result in trees of constant size. In addition, we provide data-dependent guarantees that depend on the volatility of these tree-size functions: our guarantees improve if the tree-size functions can be well approximated by simpler functions. 
Finally, via experiments, we illustrate that learning an optimal weighting of scoring rules reduces tree size.
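To make the tuned parameter concrete, the following is a minimal sketch, assuming two scoring rules are blended by a single weight `mu` in [0, 1] and that the variable with the highest blended score is branched on. The names `select_variable`, `average_tree_size`, `tune_mu`, and the callable `tree_size` are hypothetical placeholders, not taken from the paper or any solver API. `tree_size(instance, mu)` stands in for running tree search on one instance and reporting the size of the resulting tree.

```python
from typing import Callable, Sequence, TypeVar

Instance = TypeVar("Instance")


def select_variable(score_a: Sequence[float], score_b: Sequence[float], mu: float) -> int:
    """Return the index of the candidate maximizing mu*score_a + (1 - mu)*score_b."""
    if len(score_a) != len(score_b) or len(score_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    combined = [mu * a + (1.0 - mu) * b for a, b in zip(score_a, score_b)]
    # Ties are broken in favor of the lowest index.
    return max(range(len(combined)), key=combined.__getitem__)


def average_tree_size(
    mu: float,
    instances: Sequence[Instance],
    tree_size: Callable[[Instance, float], float],
) -> float:
    """Empirical average tree size of weight mu over the training instances."""
    return sum(tree_size(inst, mu) for inst in instances) / len(instances)


def tune_mu(
    instances: Sequence[Instance],
    tree_size: Callable[[Instance, float], float],
    candidates: Sequence[float],
) -> float:
    """Pick the candidate weight with the smallest average tree size.

    Assumption for illustration only: candidates come from a fixed list.
    The abstract warns that any fixed discretization can miss good weights
    lying between its points, so this is not a recommended search strategy.
    """
    return min(candidates, key=lambda mu: average_tree_size(mu, instances, tree_size))
```

The sample complexity question in the abstract is how many training instances are needed before `average_tree_size` is close to the expected tree size on the underlying distribution, for every `mu` simultaneously.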
Index Terms
- Learning to Branch: Generalization Guarantees and Limits of Data-Independent Discretization