
Rectified Binary Convolutional Networks with Generative Adversarial Learning

International Journal of Computer Vision

Abstract

Binarized convolutional neural networks (BNNs) are widely used to improve the memory and computational efficiency of deep convolutional neural networks so that they can be deployed on embedded devices. However, existing BNNs fail to exploit the potential of their corresponding full-precision models, resulting in a significant performance gap. This paper introduces a Rectified Binary Convolutional Network (RBCN), which combines full-precision kernels and feature maps to rectify the binarization process within a generative adversarial network (GAN) framework. We further prune our RBCNs using the same GAN framework to increase model efficiency and promote flexibility in practical applications. Extensive experiments validate the superior performance of the proposed RBCN over state-of-the-art BNNs on tasks such as object classification, object tracking, face recognition, and person re-identification.
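To make the idea concrete, here is a minimal PyTorch sketch of the mechanism the abstract describes: a convolution with sign-binarized weights (trained through a straight-through estimator) whose output feature maps are pushed by a small discriminator to be indistinguishable from those of the corresponding full-precision layer. This is an illustrative reconstruction, not the authors' code; the names (BinarizeSTE, BinaryConv2d, FeatureDiscriminator, rectify_step), the least-squares GAN loss, and the MSE rectification term are all our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Pass gradients straight through the non-differentiable sign().
        return grad_output


class BinaryConv2d(nn.Conv2d):
    """Convolution whose latent weights are binarized to +1/-1 at forward time."""

    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        return F.conv2d(x, w_bin, self.bias, self.stride, self.padding)


class FeatureDiscriminator(nn.Module):
    """Scores a feature map: trained to output high for full-precision
    feature maps ("real") and low for binarized ones ("fake")."""

    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, 1),
        )

    def forward(self, fmap):
        return self.net(fmap)


def rectify_step(x, conv_fp, conv_bin, disc, opt_g, opt_d):
    """One adversarial rectification step for a single layer.

    opt_g optimizes conv_bin's latent weights; opt_d optimizes disc.
    """
    real = conv_fp(x).detach()  # full-precision feature map, held fixed
    fake = conv_bin(x)          # binarized feature map

    # Discriminator update (least-squares GAN loss, one common choice).
    opt_d.zero_grad()
    d_loss = ((disc(real) - 1) ** 2).mean() + (disc(fake.detach()) ** 2).mean()
    d_loss.backward()
    opt_d.step()

    # "Generator" (binarized layer) update: fool the discriminator while
    # also matching the full-precision feature map via an MSE term.
    opt_g.zero_grad()
    g_loss = ((disc(fake) - 1) ** 2).mean() + F.mse_loss(fake, real)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

In a complete training loop, a step like this would run per layer (or per stage) alongside the ordinary task loss, with the full-precision network acting as a fixed teacher whose kernels and feature maps guide the binarized student.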


Notes

  1. In this paper, the terms “filter” and “kernel” are used interchangeably.


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants U20B2042 and 62076019, and by the Science and Technology Innovation 2030 Key Project of “New Generation Artificial Intelligence” under Grant 2020AAA0108200. It was also supported in part by the National Natural Science Foundation of China under Grants 62076016 and 61672079, and by Shenzhen Science and Technology Program KQTD2016112515134654. Chunlei Liu and Wenrui Ding contributed equally. Baochang Zhang is the corresponding author; he is also with the Shenzhen Academy of Aerospace Technology, Shenzhen, China.

Author information


Corresponding author

Correspondence to Baochang Zhang.

Additional information

Communicated by Mei Chen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, C., Ding, W., Hu, Y. et al. Rectified Binary Convolutional Networks with Generative Adversarial Learning. Int J Comput Vis 129, 998–1012 (2021). https://doi.org/10.1007/s11263-020-01417-9

