
Efficient Fixed/Floating-Point Merged Mixed-Precision Multiply-Accumulate Unit for Deep Learning Processors



Abstract:

Deep learning has attracted increasing attention in recent years, and many hardware architectures have been proposed for the efficient implementation of deep neural networks. The arithmetic unit, as the core processing element of such architectures, largely determines the functionality of the whole design. In this paper, an efficient fixed/floating-point merged multiply-accumulate unit for deep learning processors is proposed. The proposed architecture supports 16-bit half-precision floating-point multiplication with 32-bit single-precision accumulation for the training operations of deep learning algorithms. In addition, within the same hardware, the architecture supports two parallel 8-bit fixed-point multiplications whose products are accumulated into a 32-bit fixed-point number, enabling higher throughput for the inference operations of deep learning algorithms. Compared to a half-precision multiply-accumulate unit that accumulates to single precision, the proposed architecture has only 4.6% area overhead. With the proposed multiply-accumulate unit, a deep learning processor can support both training and high-throughput inference.
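As a rough illustration of the two operating modes described in the abstract, the sketch below models the arithmetic behavior (not the hardware datapath) of such a merged MAC unit: a half-precision multiply accumulated in single precision for training, and two parallel 8-bit fixed-point multiplies accumulated into a 32-bit register for inference. The function names and the use of NumPy types are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mac_fp16(acc_fp32, a, b):
    """Training mode sketch: fp16 x fp16 multiply, fp32 accumulate.

    The product of two fp16 values is representable exactly in fp32
    (22-bit significand fits in fp32's 24 bits), so computing it in
    fp32 models the merged unit's accumulation precision.
    """
    product = np.float32(np.float16(a)) * np.float32(np.float16(b))
    return np.float32(acc_fp32) + product

def mac_int8x2(acc_int32, a0, b0, a1, b1):
    """Inference mode sketch: two parallel int8 x int8 multiplies,
    both products added into one int32 accumulator per cycle."""
    p0 = np.int32(np.int8(a0)) * np.int32(np.int8(b0))
    p1 = np.int32(np.int8(a1)) * np.int32(np.int8(b1))
    return np.int32(acc_int32) + p0 + p1

# Example usage (hypothetical operand values):
acc_t = mac_fp16(np.float32(0.0), 1.5, -2.25)      # training-mode MAC
acc_i = mac_int8x2(np.int32(0), 12, -7, 100, 3)    # inference-mode dual MAC
print(acc_t, acc_i)
```

In the actual hardware, both modes share one multiplier array and accumulator datapath, which is what keeps the reported area overhead to 4.6% over a plain half-precision MAC; the sketch above only reproduces the numerical behavior of each mode.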
Date of Conference: 27-30 May 2018
Date Added to IEEE Xplore: 04 May 2018
Electronic ISSN: 2379-447X
Conference Location: Florence, Italy

