ABSTRACT
In recent years, the Convolutional Neural Network (CNN) has been successfully applied to a wide range of fields, such as image recognition and natural language processing. As CNNs are applied to increasingly complex problems, their computing and storage requirements grow accordingly. Traditionally, CNNs are executed on CPUs and GPUs, but the low throughput and energy efficiency of these platforms is a bottleneck. The Field Programmable Gate Array (FPGA) has many characteristics well suited to acceleration, making it an ideal platform for hardware acceleration of CNNs. We design and implement a convolutional neural network accelerator based on the NVIDIA Deep Learning Accelerator (NVDLA) on an FPGA platform. We present the detailed structure of NVDLA and design both the hardware system and the software system. Although the set of neural networks NVDLA can support is limited, our architecture provides high-bandwidth data communication between NVDLA and the CPU, and the CPU handles the operations that NVDLA does not support. The accelerator will therefore be able to support more, and more complex, networks in the future.
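The CPU-fallback scheme described above can be sketched as a simple layer-partitioning pass: layers whose operations the NVDLA hardware supports are placed on the accelerator, while everything else falls back to the CPU. The sketch below is purely illustrative; the names (`NVDLA_SUPPORTED`, `partition_layers`) and the exact set of supported operations are assumptions, not part of the real NVDLA software stack.

```python
# Illustrative sketch of partitioning a network between NVDLA and the CPU.
# The supported-op set below is an assumption for demonstration purposes;
# the real NVDLA hardware/compiler defines its own supported layer types.
NVDLA_SUPPORTED = {"conv", "relu", "pool", "fc", "batchnorm"}

def partition_layers(layers):
    """Assign each (name, op) layer to 'nvdla' or 'cpu'."""
    placement = []
    for name, op in layers:
        target = "nvdla" if op in NVDLA_SUPPORTED else "cpu"
        placement.append((name, target))
    return placement

# Example: a small network whose final layer (softmax here, as an example
# of an op the accelerator cannot run) must execute on the CPU.
net = [("conv1", "conv"), ("relu1", "relu"), ("fc1", "fc"), ("prob", "softmax")]
print(partition_layers(net))
```

In a real deployment the placement decision would be made per layer by the compiler, and the high-bandwidth CPU–NVDLA link described in the abstract is what makes the handoff between the two sides practical.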