ABSTRACT
With the rise of artificial intelligence and machine learning, many applications and services need FPGA support to speed up training and improve efficiency. FPGAs offer distinct advantages, notably for inference, yet TensorFlow currently supports only CPUs, GPUs, and TPUs. Because it has no FPGA support, specific models cannot be accelerated on FPGAs and the hardware's capabilities cannot be fully exploited from within TensorFlow. To address these problems, this paper proposes a method for using FPGAs efficiently in TensorFlow. The method builds on TensorFlow's existing device management mechanism, adds an abstraction for FPGA devices to the TensorFlow framework, and defines implementation specifications for FPGA operators. We then used OpenCL to build the kernels for the FPGA device, exploiting its parallel computing capability, and ran experiments with the LeNet-5 CNN model on the MNIST dataset. The experimental results show that the training accuracy on the two devices is essentially the same. This paper thus provides a feasible way for TensorFlow to use FPGA devices for neural network computation.
Index Terms
- A specification that supports FPGA devices on the TensorFlow framework