Abstract:
A domain-specific processor for energy-efficient execution of Recognition and Data Mining (RM) workloads is presented. The processor consists of a 2-D array of processing elements, together with a streaming memory hierarchy and interconnect network that are customized to efficiently execute the dominant computational kernels (matrix-vector multiplication, vector dot product, L1 norm, and L2 norm) of a wide range of RM algorithms. To achieve further energy efficiency, the RM processor utilizes scalable effort design, a technique that exploits the inherent resilience of algorithms to inexactness in their constituent computations. The scalable effort RM processor adopts a cross-layer approach, combining scaling mechanisms at the algorithm, architecture, and circuit levels to create a desirable trade-off between energy consumption and output quality. Measurements from the chip, implemented in 65 nm CMOS, indicate processing efficiencies ranging from 569 GOPS/W to 4.68 TOPS/W. The use of scalable effort design achieves energy savings of 1.2X-2.3X with no loss in output quality, and 2X-20X with a modest reduction in quality.
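As a rough illustration of the four kernels named above and of the idea of trading accuracy for effort, the following minimal Python sketch may help. It is not the chip's design: the abstract does not say which scaling mechanisms are used, so operand precision reduction is assumed here purely as a stand-in, and every name in it (`quantize`, `frac_bits`, `dot`, `matvec`, `l1_norm`, `l2_norm`) is hypothetical.

```python
import math


def quantize(x, frac_bits):
    """Round x to a fixed-point grid with frac_bits fractional bits.

    Assumption: reduced operand precision stands in for "scaled effort";
    the abstract does not specify the processor's actual mechanisms.
    """
    scale = 1 << frac_bits
    return round(x * scale) / scale


def _prepare(v, frac_bits):
    """Return v unchanged at full effort, or quantized at reduced effort."""
    if frac_bits is None:
        return list(v)
    return [quantize(x, frac_bits) for x in v]


def dot(a, b, frac_bits=None):
    """Vector dot product, optionally on reduced-precision operands."""
    a, b = _prepare(a, frac_bits), _prepare(b, frac_bits)
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v, frac_bits=None):
    """Matrix-vector multiplication as one dot product per matrix row."""
    return [dot(row, v, frac_bits) for row in m]


def l1_norm(v, frac_bits=None):
    """L1 norm: sum of absolute values."""
    return sum(abs(x) for x in _prepare(v, frac_bits))


def l2_norm(v, frac_bits=None):
    """L2 norm: square root of the vector's dot product with itself."""
    return math.sqrt(dot(v, v, frac_bits))


if __name__ == "__main__":
    weights = [0.3, -0.7, 0.45]
    inputs = [0.9, 0.2, -0.6]
    exact = dot(weights, inputs)
    for bits in (8, 4, 2):
        approx = dot(weights, inputs, frac_bits=bits)
        print(f"frac_bits={bits}: error={abs(approx - exact):.4f}")
```

Calling a kernel with `frac_bits=None` gives the full-precision result, while smaller `frac_bits` values give coarser results. In an RM algorithm, the question of interest is how far the precision can be lowered before the final output (for example, a classification label) changes.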
Date of Conference: 22-25 September 2013
Date Added to IEEE Xplore: 11 November 2013
Electronic ISBN: 978-1-4673-6146-0