Interfaces with Other Disciplines
Automatic synthesis of constraints from examples using mixed integer linear programming

https://doi.org/10.1016/j.ejor.2017.02.034

Highlights

  • We synthesize constraints from examples of feasible and infeasible solutions.

  • The resulting constraints are represented symbolically and can be used to solve the original problem.

  • The trade-off between the complexity and the number of constraints can be controlled.

  • Extensive experiments on synthetic benchmarks demonstrate the plausibility of the method.

  • A case study on the concrete slump test demonstrates the method’s viability in real-world scenarios.

Abstract

Constraints form an essential part of most practical search and optimization problems, and are usually assumed to be given. However, there are plausible real-world scenarios in which constraints are not known or can only be approximated, for instance when the process in question is complex and/or noisy. To address such problems, we propose a method that synthesizes constraints from examples of feasible and infeasible solutions. The method can produce linear, quadratic and trigonometric constraints that are guaranteed to separate the feasible and infeasible regions and minimize the number of terms involved. The synthesized constraints are represented symbolically and can be used to simulate, predict or optimize the original process. We empirically assess several characteristics of the method on three benchmarks, in particular the fidelity and the complexity of the synthesized constraints with respect to the actual constraints. We also demonstrate its application to a real-world process of concrete manufacturing. The experiments demonstrate that the method is capable of producing human-readable constraints that reflect the underlying process well and can be used to simulate it.

Introduction

Computer-assisted optimization of business processes is nowadays indispensable. In highly competitive environments, being even slightly more efficient than the competitors may bring essential leverage. Companies routinely optimize their manufacturing, supply chains, pricing and marketing, and use for that purpose sophisticated mathematical models and optimization algorithms. An inherent part of process optimization is building a model – a formal description of the variables that control and characterize the process, and of the relationships between them. The representation of a model may vary depending on the assumptions made about the underlying process (e.g., whether it is linear or nonlinear) and on the technical requirements, e.g., the adopted optimization technique or tool. A common representation is a Mathematical Programming (MP) problem (Williams, 2013), i.e., a formal object consisting of an objective function, constraints and variables.
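For illustration only (our own textbook-style example, not taken from the paper), a small linear MP problem with two decision variables can be written as:

    \begin{align*}
    \text{maximize}\quad   & 3x_1 + 2x_2       && \text{(objective function)}\\
    \text{subject to}\quad & x_1 + x_2 \le 10  && \text{(constraints)}\\
                           & x_1 - 2x_2 \ge 0\\
                           & x_1, x_2 \ge 0    && \text{(decision variables)}
    \end{align*}

A mixed-integer program would additionally restrict some of the variables to integer values.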

A common practice is to design MP problems manually, which requires deep insight into the modeled process as well as extensive knowledge of modeling techniques. Few experts combine these competences: the practitioners with in-depth knowledge of the underlying process may be unfamiliar with mathematical models, while the optimization experts may not know the business process well enough. Also, real-world processes are often complex and involve many variables, parameters, implicit assumptions and non-obvious constraints. In consequence, the handcrafted models may not match reality well enough, and their optimization may lead to results that are not applicable in practice. Such models require revision, and modeling becomes time-consuming and expensive.

An appealing alternative to manual modeling that has gained popularity in recent years is learning a model from process behavior. A sample of input variables with the associated values of the dependent (output) variables forms a training set that can be used to synthesize a model. In business environments optimization problems prevail, where the dependent variables represent yield, profit, or other continuous quantities, so the problem in question is regression, and the model being built is a regression model. Statistics and contemporary machine learning (ML) offer a broad spectrum of methods that can in principle implement arbitrarily complex models, deep neural networks being the most recent demonstrator of those capabilities (Bengio et al., 2016).

However, ML methods usually ignore the constraints associated with a given problem. Typically, an ML model is trained on data gathered from the process running within its operating conditions, and is then used to predict the dependent variable for various feasible combinations of the input variables that control the process. Such models are typically never (neither in training nor in testing) confronted with infeasible inputs and in this sense disregard the constraints that define the feasible region. Most ML models, when queried in the infeasible region, will produce some value of the dependent variable, which risks deluding decision makers into believing that certain practically unrealizable combinations of input variables are feasible. Moreover, the quality of an ML model’s response is likely to be low for infeasible examples, given that making accurate predictions for them requires extrapolation beyond the training data, which is much harder than interpolation within it. Constraints are thus essential for practical modeling, in particular for pushing the optimization to its limits and achieving a competitive advantage in real-world applications.
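To make this concern concrete, the following hedged sketch (hypothetical data, scikit-learn’s LinearRegression) fits a regression model on feasible operating points only and then queries it far outside the operating region; the model returns a numeric prediction with nothing to indicate that the input is infeasible:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical feasible operating points (inputs) and the measured yield (output).
    X_feasible = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 2.5], [2.5, 2.0]])
    y = np.array([10.2, 11.1, 10.8, 11.9])

    model = LinearRegression().fit(X_feasible, y)

    # A combination of inputs far outside the operating region: the model still
    # returns a yield estimate and gives no hint that the point is infeasible.
    print(model.predict(np.array([[50.0, -30.0]])))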

To address the demand for constraints in process modeling and optimization, we propose a novel method that automates the modeling of constraints. The input to the method is a set of examples representing feasible and infeasible states of the process, obtained by, e.g., monitoring the process and labeling the states representing normal operating conditions as feasible, and the states representing failures or breakdowns as infeasible. Alternatively, examples can be generated by a computer simulation of the process, which could help generate infeasible examples that may occur only rarely in reality.

Given the set of examples, the method formulates an appropriate mixed-integer linear programming (MILP) problem and solves it, producing constraints in the symbolic, human-readable form of a mathematical programming problem. The method, detailed in Section 3, is the core contribution of this study. In Section 2, we discuss related work. In Section 4 we analyze in detail the accuracy (fidelity with respect to the actual constraints) and complexity of the synthesized constraints, for three benchmarks of parameterized dimensionality. The experiments show that the method is in most cases capable of producing compact, transparent, and highly accurate constraints for the analyzed processes. In Section 5, we approach a real-world problem of concrete manufacturing, and show that the synthesized constraints are not only accurate but can also be used to explain the complex characteristics of this process. Section 6 discusses the results and Section 7 concludes the paper.
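The exact MILP formulation is given in Section 3; the sketch below is only our simplified illustration of the general idea, not the authors’ model. It searches for a single constraint of the form sum_j w_j·phi_j(x) <= b over a fixed basis of linear and quadratic terms, requiring feasible examples to satisfy it, infeasible examples to violate it by a margin, and minimizing the number of active terms via binary indicators. The basis, bounds, margin and the PuLP/CBC tooling are our own assumptions:

    import itertools
    import pulp

    def basis(x):
        # Candidate terms phi_j evaluated at point x: x_i, x_i^2 and x_i*x_k.
        terms = list(x)
        terms += [xi * xj for xi, xj in itertools.combinations_with_replacement(x, 2)]
        return terms

    def synthesize(feasible, infeasible, big_m=100.0, margin=1.0):
        n_terms = len(basis(feasible[0]))
        prob = pulp.LpProblem("constraint_synthesis", pulp.LpMinimize)
        w = [pulp.LpVariable(f"w_{j}", -big_m, big_m) for j in range(n_terms)]
        b = pulp.LpVariable("b", -big_m, big_m)
        z = [pulp.LpVariable(f"z_{j}", cat="Binary") for j in range(n_terms)]

        prob += pulp.lpSum(z)                    # objective: number of active terms
        for j in range(n_terms):                 # w_j may be nonzero only if z_j = 1
            prob += w[j] <= big_m * z[j]
            prob += w[j] >= -big_m * z[j]
        for x in feasible:                       # feasible points satisfy the constraint
            prob += pulp.lpSum(wj * pj for wj, pj in zip(w, basis(x))) <= b
        for x in infeasible:                     # infeasible points violate it by a margin
            prob += pulp.lpSum(wj * pj for wj, pj in zip(w, basis(x))) >= b + margin

        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return [wj.value() for wj in w], b.value()

For instance, with feasible points sampled inside a ball and infeasible points outside it, one would expect a single quadratic constraint close to the true one; the paper’s formulation additionally handles multiple constraints and controls the trade-off between their complexity and number highlighted above.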

Section snippets

State of the art

Studies on deriving constraints for optimization problems are few and far between, so the review of related works needs to reach out to quite distant areas.

The concept of learning constraints is relatively popular in the constraint programming community. For instance, Kolb (2016) uses inductive logic programming to learn a set of first-order clauses from examples and then applies preference learning to estimate the weights of those clauses. The author validates the approach on simple

Constraint synthesis

The explanation of our method can be conveniently split into two components: the representation of the constraint synthesis problem (Section 3.1) and the algorithm used to solve it (Section 3.2).

Experiment

We assess the proposed method on synthesizing constraint models of three types: n-dimensional balls, n-dimensional simplexes and n-dimensional cubes (Table 1), for n ∈ {3, 4, 5, 6, 7}. The benchmarks vary in their characteristics (a sampling sketch follows the list):

  • Each Ball_n benchmark has only one constraint, a quadratic polynomial of all n+1 variables,

  • The Simplex_n benchmark has n(n−1) linear constraints of two variables each, and one constraint of all n variables,

  • The Cube_n benchmark has 2n linear constraints of one variable each.
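As a concrete, hedged illustration (our own sketch; the paper’s sampling procedure may differ), labeled examples for such benchmarks can be generated by sampling points in a box and labeling them against the known constraints, e.g. for Ball_n:

    import numpy as np

    rng = np.random.default_rng(0)

    def is_feasible_ball(x, radius=1.0):
        # Feasible iff the point lies inside the n-dimensional ball of the given radius.
        return float(np.sum(x ** 2)) <= radius ** 2

    def sample_examples(n, m=1000, low=-2.0, high=2.0):
        # Draw m points uniformly from a box and label them against the Ball_n constraint.
        X = rng.uniform(low, high, size=(m, n))
        labels = np.array([is_feasible_ball(x) for x in X])   # True = feasible
        return X, labels

    X, labels = sample_examples(n=3)
    print(f"{labels.sum()} feasible, {(~labels).sum()} infeasible examples")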

The

Case study: concrete slump test

In this section, we use the proposed method to synthesize constraints for the problem of fabricating concrete. Particular applications require concrete with different properties, e.g., roads require concrete resilient to sulphate and the weak acids used in road salts during winter maintenance, while buildings require concrete with high thermal insulation properties. The properties of concrete depend on the proportions of ingredients, which are difficult to estimate in advance, but also the assessment

Discussion

The feature of the proposed approach that we find particularly appealing is its capability of synthesizing constraints of virtually any type. The linear, quadratic, and trigonometric terms used in this study are only a few examples of the constraints that can be handled. More sophisticated and highly nonlinear transformations can be applied as well. This extends the applicability of the method to real-life settings, where nonlinear constraints are common.
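In terms of the earlier sketch, supporting such terms amounts to enlarging the candidate basis, e.g. with trigonometric transformations (again our own illustration, not the paper’s exact term set):

    import math

    def basis_with_trig(x):
        # Linear, quadratic and trigonometric candidate terms for each variable.
        terms = list(x)
        terms += [xi * xi for xi in x]
        terms += [math.sin(xi) for xi in x]
        terms += [math.cos(xi) for xi in x]
        return terms

Because the terms are evaluated on the data before the MILP is built, the synthesis problem remains linear in the unknown coefficients regardless of how nonlinear the terms themselves are.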

Another advantage of the proposed approach is the

Conclusions

We presented a novel method to automate the modeling of business processes and systems, designed to lower the human effort of building a model. The method is fed with examples (states) of the process operating under feasible (normal, operational) and infeasible (failure, exceptional) conditions, and produces a well-formed MP problem suitable for interpretation, prediction and optimization, expressed with linear or nonlinear terms. Experimental evaluation demonstrated plausible results in terms of

Acknowledgment

T. Pawlak was supported by the funds for statutory activity of Poznan University of Technology, Poland, grant no. 09/91/DSMK/0606. K. Krawiec was supported by National Science Centre, Poland, grant no. 2014/15/B/ST6/05205.

References (20)

  • T.P. Pawlak et al. Synthesis of mathematical programming constraints with genetic programming. Proceedings of EuroGP (2017).
  • A. Aswal et al. Estimating correlated constraint boundaries from time series data: The multi-dimensional German tank problem. Proceedings of the EURO (2010).
  • N. Beldiceanu et al. A model seeker: Extracting global constraint models from positive examples. Proceedings of CP (2012).
  • Y. Bengio, I. Goodfellow, & A. Courville. Deep learning. Book in preparation for MIT... (2016).
  • C. Bessiere et al. Constraint acquisition via partial queries. Proceedings of IJCAI (2013).
  • C. Bessiere et al. A SAT-based version space algorithm for acquiring constraint satisfaction problems.
  • C. Bessiere et al. Query-driven constraint acquisition. Proceedings of IJCAI (2007).
  • European Committee for Standardization. European Standard EN... (2000).
  • P. Flach. Machine learning: The art and science of algorithms that make sense of data (2012).
  • R.E. Gomory. An algorithm for the mixed integer problem. Technical Report RM-2597-PR, RAND... (1960).
