DOI: 10.1145/3490700.3490708
research-article

A Convolutional Neural Network Accelerator Based on NVDLA

Published: 05 January 2022

ABSTRACT

In recent years, the Convolutional Neural Network (CNN) has been successfully applied to a wide range of fields, such as image recognition and natural language processing. As CNNs are applied to increasingly complex problems, their computing and storage requirements grow substantially. Traditionally, CNNs are executed on CPUs and GPUs, but the low throughput and poor energy efficiency of these platforms become a bottleneck. The Field Programmable Gate Array (FPGA) has many characteristics suitable for acceleration, so it has become an ideal platform for hardware acceleration of CNNs. We design and implement a convolutional neural network accelerator based on the NVIDIA Deep Learning Accelerator (NVDLA) on an FPGA platform. We present the detailed structure of NVDLA and design both the hardware system and the software system. Although the set of neural network operations that NVDLA can support is limited, our architecture realizes high-bandwidth data communication between NVDLA and the CPU, and the CPU handles the operations that NVDLA does not support. The accelerator will therefore be able to support more, and more complex, networks in the future.
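
To make the CPU/NVDLA split concrete, the sketch below illustrates one way a runtime could partition a network: layers of a type the accelerator handles are dispatched to NVDLA, while unsupported layers (for example, softmax) fall back to the host CPU. This is only an illustrative sketch; the function and variable names are hypothetical and do not correspond to the paper's software stack or the NVDLA runtime API.

# Minimal, self-contained sketch of the hybrid execution model described in the
# abstract. All names here (run_on_nvdla, run_on_cpu, NVDLA_SUPPORTED_OPS) are
# hypothetical placeholders, not real NVDLA runtime APIs.

from dataclasses import dataclass

# Assumed subset of layer types handled in hardware; the real set depends on
# the NVDLA configuration.
NVDLA_SUPPORTED_OPS = {"conv", "pool", "relu", "fc"}


@dataclass
class Layer:
    name: str
    op_type: str


def run_on_nvdla(layer, tensor):
    # Placeholder for submitting one layer to the NVDLA hardware via its driver.
    print(f"NVDLA: {layer.name} ({layer.op_type})")
    return tensor


def run_on_cpu(layer, tensor):
    # Placeholder for the software fallback executed on the host CPU.
    print(f"CPU  : {layer.name} ({layer.op_type})")
    return tensor


def run_network(layers, tensor):
    """Dispatch each layer to NVDLA when supported, otherwise to the CPU."""
    for layer in layers:
        if layer.op_type in NVDLA_SUPPORTED_OPS:
            tensor = run_on_nvdla(layer, tensor)
        else:
            tensor = run_on_cpu(layer, tensor)
    return tensor


if __name__ == "__main__":
    net = [Layer("conv1", "conv"), Layer("relu1", "relu"), Layer("softmax", "softmax")]
    run_network(net, tensor=None)  # conv1 and relu1 go to NVDLA, softmax to the CPU

In a real system, each fallback boundary is also a point where data must move between accelerator memory and host memory, which is why the high-bandwidth CPU-NVDLA communication path matters.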

  • Published in

    ICACS '21: Proceedings of the 5th International Conference on Algorithms, Computing and Systems
    September 2021
    139 pages
    ISBN: 9781450385084
    DOI: 10.1145/3490700

    Copyright © 2021 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States

