Base-Reconfigurable Segmented Logarithmic Quantization and Hardware Design for Deep Neural Networks


Abstract

The growth in the size of deep neural network (DNN) models poses both computational and memory challenges to the efficient and effective implementation of DNNs on platforms with limited hardware resources. Our earlier work on segmented logarithmic (SegLog) quantization, which adopts both base-2 and base-\(\sqrt{2}\) logarithmic encoding, reduces inference cost with only a small accuracy penalty. However, the weight distribution varies across layers and across DNN models, so different base-2 : base-\(\sqrt{2}\) ratios are needed to reach the best accuracy, which in turn requires different hardware designs for the decoding and computing parts. This paper extends SegLog quantization by applying a layer-wise base-2 : base-\(\sqrt{2}\) ratio to weight quantization. The proposed base-reconfigurable segmented logarithmic (BRSLog) quantization achieves 6.4x weight compression with a 1.66% Top-5 accuracy drop on AlexNet at 5-bit resolution. An arithmetic element supporting BRSLog-quantized DNN inference is proposed that adapts to different base-2 : base-\(\sqrt{2}\) ratios. With a \(\sqrt{2}\) approximation, the resource-consuming multipliers can be replaced by shifters and adders at only a 0.54% accuracy penalty. The proposed arithmetic element, simulated in a UMC 55 nm Low Power process, is 50.42% smaller in area and 55.60% lower in power consumption than the widely used 16-bit fixed-point multiplier. Compared with an equivalent SegLog arithmetic element designed for a fixed base-2 : base-\(\sqrt{2}\) ratio, the base-reconfigurable part increases the area by only 22.96 μm² and the power consumption by only 2.6 μW.
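The abstract describes the scheme only at a high level, so the following sketch (Python/NumPy) illustrates the two ideas it relies on: quantizing weights onto a segmented logarithmic grid whose base-2 : base-\(\sqrt{2}\) split is a per-layer parameter, and multiplying by a base-\(\sqrt{2}\) code using only shifts and a single add. This is a minimal illustration under stated assumptions: the code-allocation rule, the parameter `n_base2`, the \(\sqrt{2} \approx 1.5\) approximation constant, and all function names are hypothetical and are not taken from the paper's exact encoding or circuit.

```python
import numpy as np

def brslog_quantize(w, bits=5, n_base2=8):
    """Quantize weights to sign * 2**(e/2) on a segmented logarithmic grid.

    Of the 2**(bits - 1) magnitude codes (one bit is kept for the sign),
    the first codes descend in sqrt(2) steps and the last `n_base2` codes
    descend in base-2 steps; `n_base2` stands in for the paper's layer-wise
    base-2 : base-sqrt(2) ratio.  Zero handling and the exact code layout
    are simplifications.
    """
    w = np.asarray(w, dtype=np.float64)
    n_codes = 2 ** (bits - 1)
    n_sqrt2 = n_codes - n_base2

    # Allowed "half-exponents" e (magnitude = 2**(e/2)), largest first.
    steps = np.concatenate([np.ones(n_sqrt2), 2.0 * np.ones(n_base2)])
    e_top = np.floor(2.0 * np.log2(np.abs(w).max()))
    table = e_top - np.concatenate([[0.0], np.cumsum(steps[:-1])])

    # Snap each weight's half-exponent to the nearest allowed code.
    e = 2.0 * np.log2(np.maximum(np.abs(w), 1e-12))
    idx = np.abs(e[..., None] - table).argmin(axis=-1)
    return np.sign(w) * 2.0 ** (table[idx] / 2.0), table[idx].astype(int)

def shift_add_mult(x, e):
    """Compute x * 2**(e/2) for an integer activation x without a multiplier.

    An odd half-exponent contributes a factor sqrt(2), approximated here as
    1.5 = 1 + 2**-1 (one shift plus one add); the remaining even part is a
    plain barrel shift.  The exact approximation used on chip is assumed.
    """
    if e % 2:                  # base-sqrt(2) code: one extra shift-add
        x = x + (x >> 1)       # x * 1.5 ~= x * sqrt(2)
        e -= 1
    k = e // 2                 # remaining power-of-2 factor
    return x << k if k >= 0 else x >> (-k)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=1000)              # toy layer weights
    wq, half_exp = brslog_quantize(w, bits=5, n_base2=8)
    print("mean |w - wq|:", np.abs(w - wq).mean())
    print("100 * 2**(-3/2):", shift_add_mult(100, -3), "vs", 100 * 2 ** -1.5)
```

Sweeping `n_base2` per layer corresponds to the layer-wise ratio described in the abstract; the reported 22.96 μm² area and 2.6 μW power overhead is the hardware cost of making that ratio reconfigurable in the arithmetic element rather than fixing it at design time.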



Acknowledgments

This work was supported in part by NSFC grants No. 61876039 and No. 62011530132, the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01) and ZJ Lab, and the Shanghai Platform for Neuromorphic and AI Chip (No. 17DZ2260900).

Author information

Corresponding authors

Correspondence to Li-Rong Zheng or Zhuo Zou.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jiawei Xu and Yuxiang Huan contributed equally to this work.


About this article


Cite this article

Xu, J., Huan, Y., Jin, Y. et al. Base-Reconfigurable Segmented Logarithmic Quantization and Hardware Design for Deep Neural Networks. J Sign Process Syst 92, 1263–1276 (2020). https://doi.org/10.1007/s11265-020-01557-8

