Research Article
DOI: 10.1145/2897937.2897995

C-brain: a deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization

Published: 05 June 2016

Abstract

Convolutional neural network (CNN) accelerators have been proposed as an efficient hardware solution for deep-learning applications, which are known to be both compute- and memory-intensive. Although the most advanced CNN accelerators can deliver high computational throughput, their performance is highly unstable: once the accelerator is adapted to a new network with different parameters, such as layer count or kernel size, the fixed hardware structure may no longer match the data flows well. Consequently, the accelerator fails to deliver high performance because either logic resources or memory bandwidth are underutilized. To overcome this problem, we propose a novel deep-learning accelerator that offers three types of data-level parallelism: inter-kernel, intra-kernel, and hybrid. Our design can adaptively switch among the three types of parallelism, and the corresponding data-tiling schemes, to dynamically match different networks or even different layers of a single network. Regardless of the hardware configuration or network type, the proposed network-mapping strategy ensures optimal performance and energy efficiency. Compared with previous state-of-the-art NN accelerators, our design achieves a speedup of 4.0x-8.3x on some layers of well-known large-scale CNNs. Over the whole network forward-propagation phase, it saves 28.04% of PE energy and 90.3% of on-chip memory energy on average.
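For intuition, the adaptive scheme the abstract describes — switching between inter-kernel, intra-kernel, and hybrid data-level parallelism per layer — can be sketched as a utilization comparison over the processing-element (PE) array. The cost model and the names `pe_utilization` and `choose_parallelism` below are illustrative assumptions for this sketch, not the paper's actual mapping algorithm.

```python
# A minimal sketch of adaptive data-level-parallelism selection, in the
# spirit of the abstract. The utilization model is a hypothetical
# illustration, not the paper's cost model.
import math


def pe_utilization(work_units: int, num_pes: int) -> float:
    """Fraction of PE cycles doing useful work when `work_units`
    independent operations are spread across `num_pes` PEs."""
    passes = math.ceil(work_units / num_pes)
    return work_units / (passes * num_pes)


def choose_parallelism(num_kernels: int, kernel_size: int, num_pes: int) -> str:
    """Pick a parallelization scheme for one convolutional layer.

    inter-kernel: each PE computes a different output feature map
                  (parallel work = num_kernels)
    intra-kernel: PEs share the multiply-accumulates inside one kernel
                  window (parallel work = kernel_size ** 2)
    hybrid:       the PE array is factored across both dimensions
    """
    inter = pe_utilization(num_kernels, num_pes)
    intra = pe_utilization(kernel_size ** 2, num_pes)
    best_hybrid = 0.0
    for p in range(1, num_pes + 1):      # p PEs across kernels,
        if num_pes % p:                  # num_pes // p within one kernel
            continue
        u = (pe_utilization(num_kernels, p)
             * pe_utilization(kernel_size ** 2, num_pes // p))
        best_hybrid = max(best_hybrid, u)
    if best_hybrid > max(inter, intra):
        return "hybrid"
    return "inter-kernel" if inter >= intra else "intra-kernel"
```

Under this toy model, a layer with many output kernels and a small window (e.g. 256 kernels of 3x3 on a 64-PE array) favors inter-kernel parallelism, a layer with few kernels and a large window favors intra-kernel, and intermediate shapes land on hybrid — mirroring why a fixed scheme underutilizes some layers of a diverse network.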



Published In

DAC '16: Proceedings of the 53rd Annual Design Automation Conference
June 2016
1048 pages
ISBN:9781450342360
DOI:10.1145/2897937

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DAC '16

Acceptance Rates

Overall Acceptance Rate 1,317 of 3,929 submissions, 34%
