
Training Binarized Neural Networks Using MIP and CP

  • Conference paper
  • Principles and Practice of Constraint Programming (CP 2019)

Abstract

Binarized Neural Networks (BNNs) are an important class of neural networks characterized by weights and activations restricted to the set \(\{-1,+1\}\). BNNs provide simple, compact descriptions and as such have a wide range of applications in low-power devices. In this paper, we investigate a model-based approach to training BNNs using constraint programming (CP), mixed-integer programming (MIP), and CP/MIP hybrids. We formulate the training problem as finding a set of weights that correctly classify the training set instances while optimizing objective functions that have been proposed in the literature as proxies for generalizability. Our experimental results on the MNIST digit recognition dataset suggest that, when training data is limited, the BNNs found by our hybrid approach generalize better than those obtained from a state-of-the-art gradient descent method. More broadly, this work enables the analysis of neural network performance based on the availability of optimal solutions and optimality bounds.
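To make the model-based formulation concrete, below is a minimal MIP sketch for training a single binarized neuron. It illustrates the core idea only, not the authors' full model, which handles multi-layer networks and CP/MIP hybrids (see the source code linked in the Notes). Each weight \(w_j \in \{-1,+1\}\) is encoded with a 0/1 variable \(b_j\) via \(w_j = 2b_j - 1\), every training instance must be classified correctly with slack at least a margin value, and the minimum margin is maximized as one proxy for generalizability. The toy dataset, variable names, and the PuLP/CBC toolchain are assumptions made for this example.

    # A minimal MIP sketch of the training idea for a single binarized neuron.
    # This is an illustration, not the authors' exact formulation: the toy data,
    # variable names, and the PuLP/CBC toolchain are assumptions for this example.
    import pulp

    # Toy dataset: inputs and labels in {-1,+1}.
    X = [(+1, +1, -1), (+1, -1, -1), (-1, +1, +1), (-1, -1, +1)]
    y = [+1, +1, -1, -1]
    n = len(X[0])

    prob = pulp.LpProblem("bnn_single_neuron", pulp.LpMaximize)

    # Encode each weight w_j in {-1,+1} with a 0/1 variable b_j: w_j = 2*b_j - 1.
    b = [pulp.LpVariable(f"b_{j}", cat="Binary") for j in range(n)]
    margin = pulp.LpVariable("margin")  # worst-case classification margin

    # Every training instance must be classified correctly with slack >= margin:
    # y_i * sum_j (2*b_j - 1) * x_ij >= margin.
    for xi, yi in zip(X, y):
        prob += pulp.lpSum(yi * xi[j] * (2 * b[j] - 1) for j in range(n)) >= margin

    # Objective: maximize the minimum margin, a proxy for generalizability.
    prob += margin
    prob.solve(pulp.PULP_CBC_CMD(msg=False))

    weights = [int(2 * b[j].value() - 1) for j in range(n)]
    print("weights:", weights, "min margin:", margin.value())

Scaling this idea to full multi-layer BNNs requires additionally modeling the sign activations between layers; how those nonlinearities are encoded is where the paper's MIP, CP, and hybrid formulations come into play.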


Notes

  1. Our source code is publicly available at https://bitbucket.org/RToroIcarte/bnn.


Acknowledgements

We would like to thank Toryn Klassen, Maayan Shvo, and Ethan Waldie for their help running experiments and Kyle Booth, Arik Senderovich, and the anonymous reviewers for helpful comments. We gratefully acknowledge funding from CONICYT (Becas Chile), NSERC, and Microsoft Research.

Author information


Corresponding author

Correspondence to Rodrigo Toro Icarte.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Toro Icarte, R., Illanes, L., Castro, M.P., Cire, A.A., McIlraith, S.A., Beck, J.C. (2019). Training Binarized Neural Networks Using MIP and CP. In: Schiex, T., de Givry, S. (eds.) Principles and Practice of Constraint Programming. CP 2019. Lecture Notes in Computer Science, vol. 11802. Springer, Cham. https://doi.org/10.1007/978-3-030-30048-7_24


  • DOI: https://doi.org/10.1007/978-3-030-30048-7_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30047-0

  • Online ISBN: 978-3-030-30048-7

  • eBook Packages: Computer Science (R0)
