
AMAIX: A Generic Analytical Model for Deep Learning Accelerators

  • Conference paper
  • In: Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2020)

Abstract

In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the large number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the given optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which describe a design by simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for estimating CNN inference performance on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. AMAIX predicted the inference times of AlexNet and LeNet on the NVDLA with accuracies of up to 88% and 98%, respectively. Furthermore, this work shows how the obtained results can be used for root-cause analysis and as a starting point for design space exploration.
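
As a brief illustration of the Roofline model on which AMAIX is based: a workload's attainable throughput is capped by the minimum of the hardware's peak compute performance and its operational intensity (operations per byte moved) times the peak memory bandwidth. The Python sketch below applies this bound to a single CNN layer; the accelerator figures and the example layer are illustrative assumptions, not numbers taken from the paper.

    # Minimal sketch of the Roofline bound (Williams, Waterman, Patterson)
    # underlying AMAIX. Peak figures are illustrative, not NVDLA data.

    def roofline_time(ops, bytes_moved, peak_ops_per_s, peak_bytes_per_s):
        """Lower-bound execution time of one workload, e.g. a CNN layer.

        Roofline caps throughput at min(peak compute, operational
        intensity * peak bandwidth); in the time domain this becomes the
        max of the pure compute time and the pure memory-transfer time.
        """
        compute_time = ops / peak_ops_per_s
        memory_time = bytes_moved / peak_bytes_per_s
        return max(compute_time, memory_time)

    # Hypothetical convolution layer: 200 MOPs moving 4 MB of data, on an
    # accelerator with 2 TOPS peak compute and 25 GB/s DRAM bandwidth.
    t = roofline_time(ops=200e6, bytes_moved=4e6,
                      peak_ops_per_s=2e12, peak_bytes_per_s=25e9)
    print(f"layer time bound: {t * 1e6:.0f} us")  # 160 us -> memory-bound

Summing such per-layer bounds gives a first-order estimate of whole-network inference time, which is the style of estimate the abstract describes AMAIX producing for AlexNet and LeNet.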

Author information

Corresponding author

Correspondence to Lukas Jünger.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Jünger, L., Zurstraßen, N., Kogel, T., Keding, H., Leupers, R. (2020). AMAIX: A Generic Analytical Model for Deep Learning Accelerators. In: Orailoglu, A., Jung, M., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2020. Lecture Notes in Computer Science, vol 12471. Springer, Cham. https://doi.org/10.1007/978-3-030-60939-9_3

  • DOI: https://doi.org/10.1007/978-3-030-60939-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60938-2

  • Online ISBN: 978-3-030-60939-9

  • eBook Packages: Computer Science, Computer Science (R0)
