
Loom: exploiting weight and activation precisions to accelerate convolutional neural networks

Published: 24 June 2018
DOI: 10.1145/3195970.3196072

Abstract

Loom (LM), a hardware inference accelerator for Convolutional Neural Networks (CNNs), is presented. In LM, every bit of data precision that can be saved translates into a proportional performance gain. LM exploits profile-derived per-layer precisions for both weights and activations. At runtime, it further trims activation precisions at a granularity much finer than a layer. On average, across several image-classification CNNs and for a configuration that can perform the equivalent of 128 16b × 16b multiply-accumulate operations per cycle, LM outperforms a state-of-the-art bit-parallel accelerator [3] by 3.19× without any loss in accuracy while being 2.59× more energy efficient. LM can trade off accuracy for additional improvements in execution performance and energy efficiency, and compares favorably to an accelerator that targets only activation precisions.
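
The abstract's proportionality claim follows from bit-serial arithmetic: a Pw-bit weight multiplied by a Pa-bit activation decomposes into Pw × Pa one-bit partial products, so the step count, and hence performance, scales with the product of the two precisions. The Python sketch below illustrates only this shift-and-add principle; it is not the Loom hardware, and the function name and interface are hypothetical.

    # Minimal illustration (not the Loom design): bit-serial multiply-accumulate
    # whose step count scales with the product of the weight precision (pw)
    # and the activation precision (pa).
    def bit_serial_mac(weights, activations, pw, pa):
        """Accumulate sum(w * a) one weight-bit/activation-bit pair at a time.

        pw, pa: precisions in bits (profile-derived per layer in the paper).
        Returns the sum and the number of one-bit steps (pw * pa per pair).
        """
        total, steps = 0, 0
        for w, a in zip(weights, activations):
            for i in range(pw):                    # serial over weight bits
                w_bit = (w >> i) & 1
                for j in range(pa):                # serial over activation bits
                    a_bit = (a >> j) & 1
                    total += (w_bit & a_bit) << (i + j)  # 1b x 1b partial product
                    steps += 1
        return total, steps

    if __name__ == "__main__":
        w, a = [3, 5, 7], [2, 4, 6]
        result, steps = bit_serial_mac(w, a, pw=8, pa=8)
        assert result == sum(x * y for x, y in zip(w, a))  # 68
        print(result, steps)                               # 68, 3 * 8 * 8 = 192

Under this model, trimming both precisions from 16b to 8b cuts the step count by (16 × 16) / (8 × 8) = 4×; the paper's reported 3.19× average speedup reflects the same mechanism applied with the varying per-layer (and finer-grained runtime) precisions of real networks.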

References

[1] Jorge Albericio, Alberto Delmás, Patrick Judd, Sayeh Sharify, Gerard O'Leary, Roman Genov, and Andreas Moshovos. 2017. Bit-pragmatic Deep Neural Network Computing. In Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-50 '17). 382--394.
[2] Jorge Albericio, Patrick Judd, Tayler Hetherington, Tor Aamodt, Natalie Enright Jerger, and Andreas Moshovos. 2016. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA '16).
[3] Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, Ling Li, Tianshi Chen, Zhiwei Xu, Ninghui Sun, and Olivier Temam. 2014. DaDianNao: A Machine-Learning Supercomputer. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-47). 609--622.
[4] Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, and William J. Dally. 2016. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA '16). IEEE Press, Piscataway, NJ, USA, 243--254.
[5] Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor Aamodt, Natalie Enright Jerger, Raquel Urtasun, and Andreas Moshovos. 2015. Reduced-Precision Strategies for Bounded Memory in Deep Neural Nets. arXiv:1511.05236v4 [cs.LG] (2015).
[6] Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor Aamodt, and Andreas Moshovos. 2016. Stripes: Bit-serial Deep Neural Network Computing. In Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49).
[7] Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor M. Aamodt, Natalie Enright Jerger, and Andreas Moshovos. 2016. Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks. In Proceedings of the 2016 International Conference on Supercomputing (ICS '16). 23.
[8] Alberto Delmas Lascorz, Sayeh Sharify, Patrick Judd, and Andreas Moshovos. 2017. Dynamic Stripes: Exploiting the Dynamic Precision Requirements of Activation Values in Neural Networks. CoRR abs/1706.00504 (2017). http://arxiv.org/abs/1706.00504
[9] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single Shot MultiBox Detector. arXiv:1512.02325 [cs.CV] (2016).
[10] Rastislav Lukac. 2016. Computational Photography: Methods and Applications. CRC Press.
[11] Naveen Muralimanohar and Rajeev Balasubramonian. 2015. CACTI 6.0: A Tool to Understand Large Caches.
[12] Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally. 2017. SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA '17). 27--40.
[13] M. Poremba, S. Mittal, Dong Li, J. S. Vetter, and Yuan Xie. 2015. DESTINY: A Tool for Modeling Emerging 3D NVM and eDRAM Caches. In Design, Automation & Test in Europe Conference & Exhibition (DATE).
[14] T. Szabo, L. Antoni, G. Horvath, and B. Feher. 2000. A Full-Parallel Digital Implementation for Pre-trained NNs. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), Vol. 2. 49--54.
[15] Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, and Yunji Chen. 2016. Cambricon-X: An Accelerator for Sparse Neural Networks. In Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49). 1--12.



Published In

DAC '18: Proceedings of the 55th Annual Design Automation Conference
June 2018, 1089 pages
ISBN: 9781450357005
DOI: 10.1145/3195970
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DAC '18: The 55th Annual Design Automation Conference 2018
June 24-29, 2018
San Francisco, California

Acceptance Rates

Overall acceptance rate: 1,770 of 5,499 submissions (32%)
