Float-Fix: An Efficient and Hardware-Friendly Data Type for Deep Neural Network

Han, Dong; Zhou, Shengyuan; Zhi, Tian; Wang, Yibo; Liu, Shaoli

doi:10.1007/s10766-018-00626-7

Published: 29 May 2019

Volume 47, pages 345–359 (2019)

International Journal of Parallel Programming

Abstract

In recent years, as deep learning has risen in prominence, neural network accelerators have proliferated. Existing research shows that low-precision data types can improve both speed and energy efficiency. However, reducing data precision can compromise the accuracy, and therefore the usefulness, of the underlying AI application, and existing low-precision formats do not meet the requirements of all AI applications. In this paper, we propose a new data type called Float-Fix (FF). We introduce the structure of FF and compare it with other data types. In our evaluation, the accuracy loss of 8-bit FF is less than 0.12% on a subset of well-known neural network models, 7\(\times\) better on average than fixed-point, DFX and floating-point. We implement the hardware architectures of the operators and of a neural processing unit using the 8-bit FF data type with the TSMC 65 nm Gplus High VT library. The experiments show that the hardware cost of the converters between 16-bit fixed-point and FF is small. The 8-bit FF multiplier needs only 1188 \(\upmu\mathrm{m}^2\) of area, which is close to that of an 8-bit fixed-point multiplier. Compared with the neural processing unit of DianNao, FF reduces area by 34.3%.
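
The precision trade-off described in the abstract can be illustrated with plain 8-bit fixed-point, one of the baselines the paper compares against. The sketch below is not the Float-Fix format, whose bit layout is not given in this preview; the function names, the sample values and the choice of 4 or 7 fractional bits are assumptions made only for illustration.

```python
# Illustrative only: plain signed fixed-point quantization, NOT Float-Fix.
# All names and sample values here are hypothetical.

def quantize_fixed(x: float, total_bits: int = 8, frac_bits: int = 4) -> float:
    """Round x to a signed fixed-point value, saturating at the range limits."""
    scale = 1 << frac_bits                    # 2**frac_bits steps per unit
    q_min = -(1 << (total_bits - 1))          # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1       # most positive integer code
    q = round(x * scale)                      # nearest integer code
    q = max(q_min, min(q_max, q))             # saturate instead of wrapping
    return q / scale


def max_abs_error(values, total_bits: int = 8, frac_bits: int = 4) -> float:
    """Largest absolute quantization error over a list of values."""
    return max(abs(v - quantize_fixed(v, total_bits, frac_bits)) for v in values)


# Hypothetical values spanning several orders of magnitude.
weights = [0.013, -0.27, 0.5, 1.9, -3.3, 7.99]

if __name__ == "__main__":
    for frac_bits in (4, 7):
        err = max_abs_error(weights, total_bits=8, frac_bits=frac_bits)
        print(f"frac_bits={frac_bits}: max abs error = {err:.4f}")
```

With 4 fractional bits the small value 0.013 rounds to zero, while with 7 fractional bits the representable range shrinks to roughly [-1, 1) and the larger values saturate. A single fixed binary point cannot cover both ends with 8 bits, which is the kind of limitation that motivates alternative low-precision formats.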

References

Albericio, J., Judd, P., Hetherington, T., Aamodt, T., Jerger, N.E., Moshovos, A.: Cnvlutin: ineffectual-neuron-free deep neural network computing. In: International Symposium on Computer Architecture, pp. 1–13 (2016)
Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., Temam, O.: Diannao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 269–284. ACM (2014)
Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N.: Dadiannao: a machine-learning supercomputer. In: IEEE/ACM International Symposium on Microarchitecture, pp. 609–622 (2014)
Courbariaux, M., Hubara, I., Soudry, D., et al.: Binarized neural networks: training deep neural networks with weights and activations constrained to \(+\)1 or \(-\)1 (2016). arXiv:1602.02830
Courbariaux, M., Bengio, Y., David, J.P.: Binaryconnect: training deep neural networks with binary weights during propagations, pp. 3123–3131 (2015)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: Imagenet: a large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, 248–255 (2009)
Dettmers, T.: 8-bit approximations for parallelism in deep learning (2016)
Ding, C., Liao, S., Wang, Y., Li, Z., Liu, N., Zhuo, Y., Wang, C., Qian, X., Bai, Y., Yuan, G.: Circnn: accelerating and compressing deep neural networks using block-circulant weight matrices (2017)
Du, Z., Fasthuber, R., Chen, T., Ienne, P., Li, L., Luo, T., Feng, X., Chen, Y., Temam, O.: Shidiannao. ACM SIGARCH Comput. Architect. News 43(3), 92–104 (2015)
Du, Z., Lingamneni, A., Chen, Y., Palem, K., Temam, O., Wu, C.: Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators. In: Design Automation Conference, pp. 201–206 (2014)
Ewe, C.T., Cheung, P.Y.K., Constantinides, G.A.: Dual fixed-point: an efficient alternative to floating-point computation, pp. 200–208 (2004)
Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. Fiber 56(4), 3–7 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition, pp. 770–778 (2015)
Hubara, I., Courbariaux, M., Soudry, D., et al.: Quantized neural networks: training neural networks with low precision weights and activations. J. Mach. Learn. Res. 18(1), 6869–6898 (2017)
Jouppi, N.P., Young, C., Patil, N., Patterson, D., Agrawal, G., Bajwa, R., Bates, S., Bhatia, S., Boden, N., Borchers, A., et al.: In-datacenter performance analysis of a tensor processing unit. In: Proceedings of the 44th Annual International Symposium on Computer Architecture, pp. 1–12. ACM (2017)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, pp. 1097–1105 (2012)
Köster, U., Webb, T.J., Wang, X., Nassar, M., Bansal, A.K., Constable, W.H., Elibol, O.H., Gray, S., Hall, S., Hornof, L.: Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (2017)
Lin, D.D., Talathi, S.S., Annapureddy, V.S., et al.: Fixed point quantization of deep convolutional networks. In: International Conference on Machine Learning, pp. 2849–2858 (2016)
Liu, D., Chen, T., Liu, S., Zhou, J., Zhou, S., Teman, O., Feng, X., Zhou, X., Chen, Y.: Pudiannao: a polyvalent machine learning accelerator. In: Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 369–381 (2015)
Liu, S., Du, Z., Tao, J., Han, D., Luo, T., Xie, Y., Chen, Y., Chen, T.: Cambricon: an instruction set architecture for neural networks. In: Proceedings of the 43rd International Symposium on Computer Architecture, pp. 393–405. IEEE Press (2016)
Luo, T., Luo, T., Liu, S., He, L., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N.: Dadiannao: a machine-learning supercomputer. In: IEEE/ACM International Symposium on Microarchitecture, pp. 609–622 (2015)
Mellempudi, N., Kundu, A., Das, D., Mudigere, D., Kaul, B.: Mixed low-precision deep learning inference using dynamic fixed point (2017)
Miyashita, D., Lee, E.H., Murmann, B.: Convolutional neural networks using logarithmic data representation (2016)
Parashar, A., Rhu, M., Mukkara, A., Puglielli, A., Venkatesan, R., Khailany, B., Emer, J., Keckler, S.W., Dally, W.J.: Scnn: an accelerator for compressed-sparse convolutional neural networks, pp. 27–40 (2017)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Computer Science (2014)
Te Ewe, C., Cheung, P.Y.K., Constantinides, G.A.: Dual fixed-point: an efficient alternative to floating-point computation. In: International Conference on Field Programmable Logic and Applications, pp. 200–208 (2004)
Vanhoucke, V., Senior, A., Mao, M.Z.: Improving the speed of neural networks on CPUs. In: Deep Learning and Unsupervised Feature Learning Workshop, NIPS (2011)
Zhang, S., Du, Z., Zhang, L., Lan, H., Liu, S., Li, L., Guo, Q., Chen, T., Chen, Y.: Cambricon-x: an accelerator for sparse neural networks. In: IEEE/ACM International Symposium on Microarchitecture, pp. 1–12 (2016)
Zhou, A., Yao, A., Guo, Y., Xu, L., Chen, Y.: Incremental network quantization: towards lossless CNNs with low-precision weights (2017)

Acknowledgements

This work is partially supported by the National Key Research and Development Program of China (under Grants 2017YFA0700902, 2017YFB1003101), the NSF of China (under Grants 6147239, 61432016, 61473275, 61522211, 61532016, 61521092, 61502446, 61672491, 61602441, 61602446, 61732002, 61702478), the 973 Program of China (under Grant 2015CB358800), the National Science and Technology Major Project (2018ZX01031102) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDBS01050200).

Author information

Authors and Affiliations

Intelligent Processor Research Center, The Institute of Computing Technology, The Chinese Academy of Sciences, Beijing, China
Dong Han, Shengyuan Zhou, Tian Zhi & Shaoli Liu
University of Chinese Academy of Sciences, Beijing, China
Dong Han & Shengyuan Zhou
Cambricon Tech. Ltd., Beijing, China
Shaoli Liu
Tsinghua University, Beijing, China
Yibo Wang

Authors

Dong Han
Shengyuan Zhou
Tian Zhi
Yibo Wang
Shaoli Liu

Corresponding author

Correspondence to Shaoli Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Han, D., Zhou, S., Zhi, T. et al. Float-Fix: An Efficient and Hardware-Friendly Data Type for Deep Neural Network. Int J Parallel Prog 47, 345–359 (2019). https://doi.org/10.1007/s10766-018-00626-7

Received: 15 November 2018
Accepted: 24 December 2018
Published: 29 May 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s10766-018-00626-7
