DOI: 10.1145/3195970.3196033

DyHard-DNN: Even More DNN Acceleration with Dynamic Hardware Reconfiguration

Published: 24 June 2018

Abstract

Deep Neural Networks (DNNs) have demonstrated their utility across a wide range of input data types and are deployed on diverse computing substrates, from edge devices to datacenters. This broad utility has produced a myriad of hardware accelerator architectures. However, DNNs exhibit significant heterogeneity in their computational characteristics, e.g., feature and kernel dimensions, and dramatic variance in computational intensity, even between adjacent layers of a single DNN. Consequently, accelerators with static hardware parameters run sub-optimally and leave energy-efficiency margins unclaimed. We propose DyHard-DNNs, in which accelerator microarchitectural parameters are dynamically reconfigured during DNN execution to significantly improve metrics of interest. We demonstrate the effectiveness of this approach on a configurable SIMD 2D systolic array, showing a 15--65% performance improvement (at iso-power) and a 25--90% energy improvement (at iso-latency) over the best static configuration across six mainstream DNN workloads.
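The core mechanism the abstract describes, choosing array geometry per layer rather than fixing it for the whole network, can be illustrated with a minimal sketch. The following Python model is ours, not the paper's: the PE budget, candidate array shapes, layer dimensions, and cycle-count formula are all hypothetical stand-ins, chosen only to show why a single static shape loses to per-layer reconfiguration.

```python
from dataclasses import dataclass
from math import ceil

@dataclass
class Layer:
    name: str
    m: int  # GEMM output rows (e.g., output channels)
    n: int  # GEMM output cols (e.g., spatial positions)
    k: int  # reduction dimension (e.g., input channels x kernel area)

# Candidate (rows, cols) array shapes sharing one fixed budget of 256 PEs.
# Both the budget and the shapes are hypothetical.
SHAPES = [(8, 32), (16, 16), (32, 8), (64, 4)]

def cycles(layer: Layer, rows: int, cols: int) -> int:
    """Toy cost model: the layer's GEMM is tiled onto a rows x cols array;
    each tile streams the reduction dimension plus pipeline fill/drain."""
    tiles = ceil(layer.m / rows) * ceil(layer.n / cols)
    return tiles * (layer.k + rows + cols)

def best_shape(layer: Layer) -> tuple[int, int]:
    """Per-layer dynamic choice: the shape minimizing this layer's cycles."""
    return min(SHAPES, key=lambda s: cycles(layer, *s))

layers = [
    Layer("conv_early", m=64, n=3136, k=147),  # wide, shallow GEMM
    Layer("conv_late", m=512, n=49, k=4608),   # narrow, deep GEMM
    Layer("fc", m=1000, n=1, k=2048),          # matrix-vector
]

# The single best static shape for the whole network, for comparison.
static = min(SHAPES, key=lambda s: sum(cycles(l, *s) for l in layers))

for l in layers:
    r, c = best_shape(l)
    ratio = cycles(l, *static) / cycles(l, r, c)
    print(f"{l.name}: dynamic {r}x{c} vs static {static[0]}x{static[1]} "
          f"-> static takes {ratio:.2f}x the cycles")
```

Under this toy model, layers with skewed GEMM dimensions (the matrix-vector fully connected layer) prefer a tall, narrow array, while wide early convolutional layers prefer a squarer one, so no single static shape is optimal for all layers.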



Published In

DAC '18: Proceedings of the 55th Annual Design Automation Conference
June 2018, 1089 pages
ISBN: 9781450357005
DOI: 10.1145/3195970


Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

DAC '18: The 55th Annual Design Automation Conference 2018
June 24 - 29, 2018
San Francisco, California, USA

Acceptance Rates

Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%



Cited By

  • (2023) Accelerating Deep Learning Inference via Model Parallelism and Partial Computation Offloading. IEEE Transactions on Parallel and Distributed Systems 34(2), 475-488. DOI: 10.1109/TPDS.2022.3222509
  • (2023) DyNNamic: Dynamically Reshaping, High Data-Reuse Accelerator for Compact DNNs. IEEE Transactions on Computers 72(3), 880-892. DOI: 10.1109/TC.2022.3184272
  • (2023) Design principles for lifelong learning AI accelerators. Nature Electronics 6(11), 807-822. DOI: 10.1038/s41928-023-01054-3
  • (2022) A Formalism of DNN Accelerator Flexibility. Proceedings of the ACM on Measurement and Analysis of Computing Systems 6(2), 1-23. DOI: 10.1145/3530907
  • (2022) AI accelerator on IBM Telum processor. Proceedings of the 49th Annual International Symposium on Computer Architecture, 1012-1028. DOI: 10.1145/3470496.3533042
  • (2022) Fused-Layer-based DNN Model Parallelism and Partial Computation Offloading. GLOBECOM 2022 - 2022 IEEE Global Communications Conference, 5195-5200. DOI: 10.1109/GLOBECOM48099.2022.10000779
  • (2021) Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation. Micromachines 12(11), 1365. DOI: 10.3390/mi12111365
  • (2021) Practical Attacks on Deep Neural Networks by Memory Trojaning. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 40(6), 1230-1243. DOI: 10.1109/TCAD.2020.2995347
  • (2021) RaPiD. Proceedings of the 48th Annual International Symposium on Computer Architecture, 153-166. DOI: 10.1109/ISCA52012.2021.00021
  • (2020) Performance Modeling for CNN Inference Accelerators on FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39(4), 843-856. DOI: 10.1109/TCAD.2019.2897634
