7.2 A 12nm Programmable Convolution-Efficient Neural-Processing-Unit Chip Achieving 825TOPS


Abstract:

Convolutional neural networks (CNNs) represent a key application in data centers, which calls for accelerators that are: 1) efficient for CNN computations; 2) high-throughput, so as to be cost-efficient; and 3) sufficiently programmable to accommodate algorithm upgrades. Lacking such a chip on the market, we designed our own. Matrix multiplication (MM) and convolution (CONV) are the top-2 deep-learning (DL) operations requiring intensive computation. Most existing accelerators, such as GPUs [6], [7], the TPU [9], and a few new AI chips [3], [4], are architected for GEMM. To compute CONV on a GEMM engine, one needs the im2col() transformation to flatten images into general matrices. This introduces substantial data inflation, which not only leads to unnecessary extra computation and storage, but also decreases arithmetic intensity and makes performance I/O- and memory-bound. Although some accelerators, such as [5], exploit a CONV architecture directly, integrating large yet balanced computing power into a single chip is quite challenging. Moreover, with the fast evolution of DL algorithms, it is critical to design a programmable neural processing unit (NPU) rather than a dedicated ASIC for data-center scenarios. To satisfy the above requirements, our NPU is architected to be CONV-efficient under the control of operation-fused coarse-grained instructions. It integrates as much computing power as possible via squeezed computation with a large SRAM-only design. It also delivers programming flexibility via an instruction set architecture (ISA) that covers anticipated forward-looking functionality.
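As a rough illustration of the data inflation mentioned above, the following is a minimal Python sketch of an im2col() flattening for a single-channel image (the function, shapes, and loop structure are illustrative assumptions, not the chip's or the paper's implementation). Each overlapping window is copied into its own matrix column, so for a stride-1 convolution the flattened matrix approaches k*k times the size of the original image.

import numpy as np

def im2col(img, k, stride=1):
    # Illustrative sketch: flatten each k x k sliding window of a 2-D image
    # into one column of the output matrix. Overlapping windows duplicate
    # pixels, which is the source of the data inflation.
    h, w = img.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    cols = np.empty((k * k, out_h * out_w), dtype=img.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            cols[:, i * out_w + j] = img[r:r + k, c:c + k].ravel()
    return cols

img = np.arange(36, dtype=np.float32).reshape(6, 6)
cols = im2col(img, k=3)
print(img.size, "->", cols.size)  # 36 -> 144: a 4x blow-up here; tends toward k*k = 9x for large images

The extra copies also lower arithmetic intensity: the same multiply-accumulate work must now be fed by a matrix several times larger than the original activations, which is why a GEMM-only engine tends to become I/O- and memory-bound on CONV layers.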
Date of Conference: 16-20 February 2020
Date Added to IEEE Xplore: 13 April 2020
Conference Location: San Francisco, CA, USA
