Abstract:
Accelerating deep neural networks on resource-constrained embedded devices is becoming increasingly important for real-time applications. However, in contrast to the intensive research on specialized neural network inference architectures, there has been little study of accelerating and parallelizing deep learning inference on embedded chip-multiprocessor architectures, which many real-time applications favor for their energy efficiency and scalability. In this work, we investigate strategies for parallelizing single-pass deep neural network inference on embedded on-chip multi-core accelerators. These methods exploit the elasticity and noise tolerance of deep learning algorithms to circumvent the bottleneck of on-chip inter-core data movement and to reduce the communication overhead, which grows as the number of cores scales up. Experimental results show that the communication-aware sparsified parallelization method improves system performance by 1.1×-1.6× and achieves 1.6×-4× better interconnect energy efficiency across different neural networks.
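The abstract does not give implementation details, but the core idea of communication-aware sparsified parallelization can be illustrated with a minimal sketch: a layer's output neurons are partitioned across cores, and each core shares only the activations whose magnitude exceeds a threshold, trading a small amount of accuracy (the noise tolerance of the network) for less inter-core traffic. Everything below (function names, the threshold value, the layer sizes) is an assumption for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of communication-sparsified parallel inference.
# Output neurons are split across NUM_CORES "cores"; each core computes
# its slice and "sends" only activations above THRESHOLD, so low-magnitude
# values are dropped instead of moved over the on-chip interconnect.
import numpy as np

NUM_CORES = 4
THRESHOLD = 0.05  # assumed magnitude cutoff for "worth communicating"

def layer_forward_parallel(x, W, b):
    """One fully connected layer with its output rows split across cores,
    sparsifying the activations each core shares with the others."""
    parts = np.array_split(np.arange(W.shape[0]), NUM_CORES)
    y = np.zeros(W.shape[0])
    sent = 0
    for rows in parts:                                  # each "core" owns a slice
        local = np.maximum(W[rows] @ x + b[rows], 0.0)  # ReLU partial result
        mask = np.abs(local) > THRESHOLD                # significant activations only
        y[rows[mask]] = local[mask]                     # values below cutoff stay zero
        sent += int(mask.sum())                         # words actually moved off-core
    return y, sent

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
W = rng.standard_normal((128, 256)) * 0.05
b = np.zeros(128)
y, sent = layer_forward_parallel(x, W, b)
print(f"communicated {sent}/{y.size} activations")
```

Under these assumptions, the fraction of activations actually communicated falls well below the dense case, which is the mechanism behind the interconnect energy savings the abstract reports; the threshold controls the accuracy/communication trade-off.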
Date of Conference: 25-29 March 2019
Date Added to IEEE Xplore: 16 May 2019