Abstract
This paper discusses a new method to perform propagation over a (two-layer, feed-forward) Neural Network embedded in a Constraint Programming model. The method is meant to be employed in Empirical Model Learning, a technique designed to enable optimal decision making over systems that cannot be modeled via conventional declarative means. The key step in Empirical Model Learning is to embed a Machine Learning model into a combinatorial model. It has been shown that Neural Networks can be embedded in a Constraint Programming model by simply encoding each neuron as a global constraint, which is then propagated individually. Unfortunately, this decomposition approach may lead to weak bounds. To overcome this limitation, we propose a new network-level propagator based on a non-linear Lagrangian relaxation that is solved with a subgradient algorithm. The method proved capable of dramatically reducing the search tree size on a thermal-aware dispatching problem on multicore CPUs. The overhead for optimizing the Lagrangian multipliers is kept within a reasonable level via a few simple techniques. This paper is an extended version of [27], featuring an improved structure, a new filtering technique for the network inputs, a set of overhead reduction techniques, and a thorough experimental evaluation.
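To give a flavor of the subgradient machinery mentioned in the abstract, the sketch below applies a projected subgradient loop to a toy Lagrangian dual. This is only an illustration of the generic technique, not the paper's propagator: the toy problem (minimize x² subject to x ≥ 1) and all names are ours.

```python
def dual_bound(steps=50):
    """Projected subgradient ascent on a toy Lagrangian dual.

    Primal: min x^2 subject to x >= 1 (optimal value 1).
    Dual:   L(lam) = min_x [x^2 + lam * (1 - x)], maximized over lam >= 0.
    Every dual value is a valid lower bound on the primal optimum.
    """
    lam, best = 0.0, float("-inf")
    for t in range(steps):
        x = lam / 2.0                        # minimizer of the Lagrangian for fixed lam
        val = x * x + lam * (1.0 - x)        # dual value (a lower bound)
        best = max(best, val)
        sub = 1.0 - x                        # constraint violation = subgradient at lam
        lam = max(0.0, lam + sub / (t + 1))  # diminishing step, projected onto lam >= 0
    return best
```

With 50 iterations the bound approaches the true optimum 1, which is the same mechanism (at network level, with one multiplier per relaxed coupling) used to tighten the bounds produced by the neuron-level decomposition.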
Notes
Of course the propagation is likely to be less effective for more complex networks.
Real-valued variables with fixed precision can be modeled via integer variables: e.g. a number in [0,1] with precision 0.01 corresponds to an integer in {0..100}. This representation requires some care to ensure consistent rounding. More details can be found in [2].
Google OR-tools, at https://developers.google.com/optimization/
References
Audet, C. (2014). A survey on direct search methods for blackbox optimization and their applications. In Mathematics Without Boundaries (pp. 31–56): Springer.
Bartolini, A., Lombardi, M., Milano, M., & Benini, L. (2011). Neuron Constraints to Model Complex Real-World Problems. In Proc. of CP (pp. 115–129).
Bartolini, A., Lombardi, M., Milano, M., & Benini, L. (2012). Optimization and Controlled Systems: A Case Study on Thermal Aware Workload Dispatching. Proc. of AAAI.
Basheer, I.A., & Hajmeer, M. (2000). Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods, 43(1), 3–31.
Belew, R.K., McInerney, J., & Schraudolph, N.N. (1991). Evolving networks: Using the genetic algorithm with connectionist learning. Proc. of Artificial Life, 511–547.
Belotti, P., Lee, J., Liberti, L., Margot, F., & Wächter, A. (2009). Branching and bounds tightening techniques for non-convex MINLP. Optimization Methods and Software, 24(4-5), 597–634.
Bergman, D., Ciré, A. A., & van Hoeve, W.-J. (2015). Lagrangian bounds from decision diagrams. Constraints, 20(3), 346–361.
Bonfietti, A., & Lombardi, M. (2012). The weighted average constraint. In Proc. of CP (pp. 191–206): Springer.
Bonfietti, A., Lombardi, M., & Milano, M. (2015). Embedding decision trees and random forests in constraint programming. In Proc. of CPAIOR (pp. 74–90).
Cambazard, H., & Fages, J.-G. (2015). New filtering for AtMostNValue and its weighted variant: A Lagrangian approach. Constraints, 20(3), 362–380.
Chow, T.T., Zhang, G.Q., Lin, Z., & Song, C.L. (2002). Global optimization of absorption chiller system by genetic algorithm and neural network. Energy and Buildings, 34(1), 103–109.
Conn, A.R., Scheinberg, K., & Vicente, L.N. (2009). Introduction To Derivative-free Optimization, volume 8. Siam.
d’Antonio, G., & Frangioni, A. (2009). Convergence analysis of deflected conditional approximate subgradient methods. SIAM Journal on Optimization, 20(1), 357–386.
Focacci, F., Lodi, A., & Milano, M. (1999). Cost-based domain filtering. In Proc. of CP: Springer.
Ge, S.S., Hang, C.C., Lee, T.H., & Zhang, T. (2010). Stable adaptive neural network control. Springer Publishing Company, Incorporated.
Gent, I.P., Kotthoff, L., Miguel, I., & Nightingale, P. (2010). Machine learning for constraint solver design – A case study for the alldifferent constraint. CoRR, abs/1008.4326.
Glover, F., Kelly, J.P., & Laguna, M. (1999). New Advances for Wedding optimization and simulation. In Proc. of WSC. IEEE (pp. 255–260).
Gopalakrishnan, K., & Asce, A.M. (2009). Neural Network Swarm Intelligence Hybrid Nonlinear Optimization Algorithm for Pavement Moduli Back-Calculation. Journal of Transportation Engineering, 136(6), 528–536.
Gualandi, S., & Malucelli, F. (2012). Resource constrained shortest paths with a super additive objective function. In Proc. of CP (pp. 299–315): Springer.
Howard, J., Dighe, S., Vangal, S.R., Ruhl, G., Borkar, N., Jain, S., Erraguntla, V., Konow, M., Riepen, M., Gries, M., et al. (2011). A 48-core IA-32 processor in 45 nm CMOS using on-die message-passing and DVFS for performance and power scaling. IEEE Journal of Solid-State Circuits, 46(1), 173–183.
Huang, W., Ghosh, S., & Velusamy, S. (2006). HotSpot: A compact thermal modeling methodology for early-stage VLSI design. IEEE Transactions on VLSI, 14(5), 501–513.
Hutter, F., Hoos, H. H., Leyton-Brown, K., & Stützle, T. (2009). ParamILS: An automatic algorithm configuration framework. Journal of Artificial Intelligence Research, 36, 267–306.
Jayaseelan, R., & Mitra, T. (2009). A hybrid local-global approach for multi-core thermal management. In Proc. of ICCAD (pp. 314–320): ACM Press.
Kiranyaz, S., Ince, T., Yildirim, A., & Gabbouj, M. (2009). Evolutionary artificial neural networks by multi-dimensional particle swarm optimization. Neural Networks, 22(10), 1448–1462.
Lemaréchal, C. (2001). Lagrangian relaxation. In Computational Combinatorial Optimization (pp. 112–156): Springer.
Ljung, L. (1999). System identification. Wiley Online Library.
Lombardi, M., & Gualandi, S. (2013). A new propagator for two-layer neural networks in empirical model learning. In Proc. of CP (pp. 448–463).
Montana, D.J., & Davis, L. (1989). Training feedforward neural networks using genetic algorithms. In Proc. of IJCAI (pp. 762–767).
Moore, J., Chase, J.S., & Ranganathan, P. (2006). Weatherman: Automated, Online and Predictive Thermal Mapping and Management for Data Centers. In Proc. of ICAC. IEEE (pp. 155–164).
Moré, J.J. (1978). The Levenberg-Marquardt algorithm: implementation and theory. In Numerical analysis (pp. 105–116): Springer.
Queipo, N.V., Haftka, R.T., Shyy, W., Goel, T., Vaidyanathan, R., & Tucker, P.K. (2005). Surrogate-based analysis and optimization. Progress In Aerospace Sciences, 41(1), 1–28.
Sellmann, M. (2004). Theoretical foundations of CP-based Lagrangian relaxation. In Proc. of CP (pp. 634–647): Springer.
Sellmann, M., & Fahle, T. (2003). Constraint programming based Lagrangian relaxation for the automatic recording problem. Annals of Operations Research, 118(1–4), 17–33.
Slusky, M.R., & van Hoeve, W.J. (2013). A Lagrangian relaxation for Golomb rulers. In Proc. of CPAIOR (pp. 251–267): Springer.
Van Cauwelaert, S., Lombardi, M., & Schaus, P. (2015). Understanding the potential of propagators. In Proc. of CPAIOR (pp. 427–436).
Zhang, G., Patuwo, B.E., & Hu, M.Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1), 35–62.
Lombardi, M., Gualandi, S. A lagrangian propagator for artificial neural networks in constraint programming. Constraints 21, 435–462 (2016). https://doi.org/10.1007/s10601-015-9234-6