Training Binarized Neural Networks Using MIP and CP

Toro Icarte, Rodrigo; Illanes, León; Castro, Margarita P.; Cire, Andre A.; McIlraith, Sheila A.; Beck, J. Christopher

doi:10.1007/978-3-030-30048-7_24

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11802)

Included in the following conference series:

International Conference on Principles and Practice of Constraint Programming

Abstract

Binarized Neural Networks (BNNs) are an important class of neural networks characterized by weights and activations restricted to the set \(\{-1,+1\}\). BNNs provide simple, compact descriptions and as such have a wide range of applications in low-power devices. In this paper, we investigate a model-based approach to training BNNs using constraint programming (CP), mixed-integer programming (MIP), and CP/MIP hybrids. We formulate the training problem as finding a set of weights that correctly classifies the training set instances while optimizing objective functions that have been proposed in the literature as proxies for generalizability. Our experimental results on the MNIST digit recognition dataset suggest that, when training data is limited, the BNNs found by our hybrid approach generalize better than those obtained from a state-of-the-art gradient descent method. More broadly, this work enables the analysis of neural network performance based on the availability of optimal solutions and optimality bounds.
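
To make the training problem in the abstract concrete, the following is a minimal toy sketch, not the authors' CP or MIP model. It assumes a single hidden layer, no bias terms, and a sign activation that maps a pre-activation of 0 to +1. Plain exhaustive enumeration stands in for the solvers, and the proxy objectives for generalizability are omitted. The names `sign`, `predict`, `train_by_enumeration`, and the "teacher" network used to label the toy data are all hypothetical.

```python
from itertools import product

def sign(v):
    # Assumption: a pre-activation of exactly 0 maps to +1.
    return 1 if v >= 0 else -1

def predict(hidden, out, x):
    # hidden: one tuple of +/-1 weights per hidden neuron; out: output weights.
    h = [sign(sum(w * xi for w, xi in zip(row, x))) for row in hidden]
    return sign(sum(w * hi for w, hi in zip(out, h)))

def train_by_enumeration(data, n_in, n_hidden):
    # Try every assignment of weights in {-1, +1} and return the first one
    # that classifies all training instances correctly (None if there is none).
    n_weights = n_in * n_hidden + n_hidden
    for flat in product((-1, 1), repeat=n_weights):
        hidden = [flat[i * n_in:(i + 1) * n_in] for i in range(n_hidden)]
        out = flat[n_in * n_hidden:]
        if all(predict(hidden, out, x) == y for x, y in data):
            return hidden, out
    return None

# Toy data labelled by a hypothetical teacher network, so that at least one
# consistent weight assignment is guaranteed to exist.
teacher_hidden = [(1, -1, 1), (-1, -1, 1), (1, 1, 1)]
teacher_out = (1, 1, -1)
inputs = list(product((-1, 1), repeat=3))
data = [(x, predict(teacher_hidden, teacher_out, x)) for x in inputs]

result = train_by_enumeration(data, n_in=3, n_hidden=3)
```

Enumeration is only feasible here because the toy network has 12 binary weights (4096 assignments). The search space doubles with every added weight, which is why a model-based solver is needed at realistic sizes.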

Notes

1. Our source code is publicly available at https://bitbucket.org/RToroIcarte/bnn.

Acknowledgements

We would like to thank Toryn Klassen, Maayan Shvo, and Ethan Waldie for their help running experiments and Kyle Booth, Arik Senderovich, and the anonymous reviewers for helpful comments. We gratefully acknowledge funding from CONICYT (Becas Chile), NSERC, and Microsoft Research.

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, Canada
Rodrigo Toro Icarte, León Illanes & Sheila A. McIlraith
Vector Institute, Toronto, Canada
Rodrigo Toro Icarte & Sheila A. McIlraith
Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada
Margarita P. Castro & J. Christopher Beck
Department of Management, University of Toronto Scarborough, Toronto, Canada
Andre A. Cire

Corresponding author

Correspondence to Rodrigo Toro Icarte.

Editor information

Editors and Affiliations

INRA, Castanet Tolosan, France
Thomas Schiex
INRA, Castanet Tolosan, France
Simon de Givry

About this paper

Cite this paper

Toro Icarte, R., Illanes, L., Castro, M.P., Cire, A.A., McIlraith, S.A., Beck, J.C. (2019). Training Binarized Neural Networks Using MIP and CP. In: Schiex, T., de Givry, S. (eds) Principles and Practice of Constraint Programming. CP 2019. Lecture Notes in Computer Science, vol 11802. Springer, Cham. https://doi.org/10.1007/978-3-030-30048-7_24

DOI: https://doi.org/10.1007/978-3-030-30048-7_24
Published: 23 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30047-0
Online ISBN: 978-3-030-30048-7
eBook Packages: Computer Science, Computer Science (R0)