
ParaML: A Polyvalent Multicore Accelerator for Machine Learning


Abstract:

In recent years, machine learning (ML) techniques have proven to be powerful tools in a variety of emerging applications. Traditionally, ML techniques are processed on general-purpose CPUs and GPUs, but the energy efficiency of these platforms is limited by the overhead of their support for flexibility. Hardware accelerators are an efficient alternative to CPUs/GPUs, but they remain limited in that they often accommodate only a single ML technique (family). However, different problems may require different ML techniques, so such accelerators may achieve poor learning accuracy or even be ineffective. In this paper, we present a polyvalent accelerator architecture integrated with multiple processing cores, called ParaML, which accommodates ten representative ML techniques: k-means, k-nearest neighbors (k-NN), naive Bayes (NB), support vector machine (SVM), linear regression (LR), classification tree (CT), deep neural network (DNN), learning vector quantization (LVQ), Parzen window (PW), and principal component analysis (PCA). Benefiting from our thorough analysis of the computational primitives and locality properties of these ML techniques, the single-core ParaML can perform up to 1056 GOP/s (e.g., additions and multiplications) in an area of 3.51 mm² while consuming only 596 mW, as estimated by ICC and PrimeTime PX, respectively, on the post-synthesis netlist. Compared with the NVIDIA K20M GPU (28-nm process), the single-core ParaML (65-nm process) is 1.21× faster and reduces energy by 137.93×. We also compare the single-core ParaML with other accelerators. Compared with PRINS, the single-core ParaML achieves 72.09× and 2.57× energy benefits for k-NN and k-means, respectively, and speeds up each k-NN query by 44.76×. Compared with EIE, the single-core ParaML achieves a 5.02× speedup and a 4.97× energy benefit with 11.62× less area when evaluated on a dense DNN. Compared with TPU, the single-core ParaML achieves 2.45× better power efficiency (5647 GOP/W versus 2300 GOP/W) with 321.3...
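The abstract's claim that one accelerator can serve ten ML techniques rests on the observation that these techniques share a small set of computational primitives. The following NumPy sketch is not taken from the paper; it is a hypothetical illustration of that idea, showing how a multiply-accumulate (dot-product) primitive and a squared-distance primitive built on top of it cover the inner loops of several of the listed techniques (SVM/LR/DNN-style decision values, and k-means/k-NN/LVQ-style nearest-prototype search).

```python
import numpy as np

def dot(w, x):
    # Multiply-accumulate: the core primitive behind SVM and LR decision
    # values, DNN layer outputs, and PCA projections.
    return float(np.dot(w, x))

def squared_distance(a, b):
    # Distance primitive behind k-means, k-NN, LVQ, and Parzen windows;
    # itself expressible with the same multiply-accumulate hardware.
    d = a - b
    return float(np.dot(d, d))

# SVM/LR-style decision value: w.x + b (weights and input are made up for illustration)
w, b = np.array([0.5, -1.0, 2.0]), 0.1
x = np.array([1.0, 2.0, 3.0])
print("decision value:", dot(w, x) + b)

# k-means/k-NN-style nearest-prototype assignment over two example centroids
centroids = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
print("nearest centroid:", int(np.argmin([squared_distance(x, c) for c in centroids])))
```

Under this (assumed) view, an accelerator that provides efficient multiply-accumulate datapaths plus the right on-chip data reuse can serve the whole family of techniques, which is the kind of analysis of primitives and locality the abstract refers to.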
Page(s): 1764 - 1777
Date of Publication: 09 July 2019


