Design of a Convolutional Neural Network Instruction Set Based on RISC-V and Its Microarchitecture Implementation

Jiao, Qiang; Hu, Wei; Wen, Yuan; Dong, Yong; Li, Zhenhao; Gan, Yu

doi:10.1007/978-3-030-60239-0_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12453)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

The success of Convolutional Neural Networks (CNNs) in computer vision places continuing demands on performance in both training and inference. Various software optimizations have been explored for existing hardware such as CPUs and GPUs to meet these computational needs; however, the gap between ideal and achieved performance will persist without dedicated hardware support. In this paper, we propose a customized CNN processor built by extending the RISC-V instruction set. We add six primary instructions derived by analyzing and abstracting the characteristics of conventional CNN models. The target microarchitecture is upgraded accordingly to exploit the parallelism in massive data accesses. We evaluate our design with the widely used CNN model LeNet-5 on a Field Programmable Gate Array (FPGA) to validate correctness. Compared with the traditional x86 and MIPS ISAs, our design provides higher code density and performance efficiency.
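
As a rough illustration of what extending the RISC-V instruction set can involve at the encoding level, the sketch below packs and unpacks a standard R-type instruction word in the custom-0 opcode space that the RISC-V specification reserves for non-standard extensions. The abstract does not give the encodings of the six added instructions, so the function names, field values, and the choice of custom-0 here are assumptions made for illustration only, not the authors' design.

```python
# Hypothetical sketch: R-type encoding in the RISC-V custom-0 opcode space.
# The paper's actual instruction formats are not stated in the abstract.

CUSTOM_0 = 0b0001011  # major opcode reserved by the RISC-V spec for custom use

# R-type layout, most significant field first: (name, bit width)
R_TYPE_FIELDS = [
    ("funct7", 7), ("rs2", 5), ("rs1", 5),
    ("funct3", 3), ("rd", 5), ("opcode", 7),
]


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM_0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    values = dict(funct7=funct7, rs2=rs2, rs1=rs1,
                  funct3=funct3, rd=rd, opcode=opcode)
    word = 0
    for name, width in R_TYPE_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit instruction word back into its R-type fields."""
    fields = {}
    # Walk the layout from the least significant field (opcode) upward.
    for name, width in reversed(R_TYPE_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


if __name__ == "__main__":
    # A made-up custom instruction: rd = x3, rs1 = x1, rs2 = x2.
    word = encode_r_type(funct7=0, rs2=2, rs1=1, funct3=0, rd=3)
    print(f"{word:#010x}")
    print(decode_r_type(word))
```

Reusing the base R-type layout keeps the register fields in their usual bit positions, which is one reason custom extensions often stay within the standard formats; whether the proposed processor does so is not stated in the abstract.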

Acknowledgement

This work was supported by Science Foundation Ireland grant 13/RC/2094 to Lero - The Irish Software Research Centre.

Author information

Authors and Affiliations

Wuhan University of Science and Technology, Wuhan, China
Qiang Jiao, Wei Hu, Yong Dong, Zhenhao Li & Yu Gan
Trinity College Dublin, Dublin, Ireland
Yuan Wen

Corresponding author

Correspondence to Qiang Jiao.

Editor information

Editors and Affiliations

Columbia University, New York, NY, USA
Meikang Qiu

About this paper

Cite this paper

Jiao, Q., Hu, W., Wen, Y., Dong, Y., Li, Z., Gan, Y. (2020). Design of a Convolutional Neural Network Instruction Set Based on RISC-V and Its Microarchitecture Implementation. In: Qiu, M. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol 12453. Springer, Cham. https://doi.org/10.1007/978-3-030-60239-0_6

DOI: https://doi.org/10.1007/978-3-030-60239-0_6
Published: 29 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60238-3
Online ISBN: 978-3-030-60239-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
