Abstract
Modern image processing applications, such as object detection and image segmentation, are computationally demanding and memory-intensive. Hardware accelerators on ASIC- or FPGA-based architectures are a promising solution, but they lack flexibility and programmability. To meet the flexibility, computational, and memory demands of these applications in embedded systems, we propose a modular and flexible RISC-V based MPSoC architecture on the Xilinx Zynq UltraScale+ MPSoC. The proposed architecture can be ported to other Xilinx FPGAs. Two neural networks (LeNet-5 and the CIFAR-10 example) were used as test applications to evaluate the proposed MPSoC architectures. To increase performance and efficiency, different optimization techniques were applied on the MPSoC and the results were evaluated. 16-bit fixed-point parameters were used to reduce the data size by 50%, and the algorithms were parallelized and mapped onto the proposed MPSoC to achieve higher performance. A 4x parallelization of an NN algorithm on the proposed MPSoC resulted in a 3.96x speedup and consumed 3.61x less energy than a single soft-core processor setup.
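The two headline figures in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the Q8.8 split (`FRAC_BITS = 8`), the helper names `to_fixed16` and `from_fixed16`, and the sample weights are hypothetical. The 50% figure is read here as being relative to 32-bit floating-point parameters, which is an assumption the abstract does not state.

```python
import struct

# Hypothetical Q8.8 layout: 8 integer bits, 8 fractional bits.
# The abstract only says "16-bit fixed-point"; the split is an assumption.
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -32768, 32767


def to_fixed16(x, frac_bits=FRAC_BITS):
    """Quantize a float to a signed 16-bit fixed-point integer, saturating on overflow."""
    scaled = int(round(x * (1 << frac_bits)))
    return max(INT16_MIN, min(INT16_MAX, scaled))


def from_fixed16(q, frac_bits=FRAC_BITS):
    """Convert a 16-bit fixed-point integer back to a float."""
    return q / (1 << frac_bits)


# Illustrative parameter values, not taken from the paper.
weights = [0.5, -1.25, 3.141592, 100.0, -200.0]
quantized = [to_fixed16(w) for w in weights]

# Storage cost: 32-bit floats (assumed baseline) versus 16-bit integers.
float_bytes = len(struct.pack(f"<{len(weights)}f", *weights))
fixed_bytes = len(struct.pack(f"<{len(quantized)}h", *quantized))
size_ratio = fixed_bytes / float_bytes  # 0.5, i.e. a 50% reduction

# Parallel efficiency implied by the reported 3.96x speedup on 4 cores.
speedup = 3.96
cores = 4
parallel_efficiency = speedup / cores

print(quantized, size_ratio, parallel_efficiency)
```

Under these assumptions, the reported speedup corresponds to a parallel efficiency of about 99%. The last sample weight saturates at the lower 16-bit bound, which shows the range limit that a fixed-point format imposes on parameters.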
Acknowledgments
This work has been funded partially by the German Federal Ministry of Education and Research BMBF as part of the PARIS project under grant agreement number 16ES0657 and partially by COllective Research NETworking (CORNET) project AITIA: Embedded AI Techniques for Industrial Applications. CORNET-AITIA is funded by the BMWi (Federal Ministry for Economic Affairs and Energy) under the IGF-project number: 249 EBG.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ali, M., Amini Rad, P., Göhringer, D. (2020). RISC-V Based MPSoC Design Exploration for FPGAs: Area, Power and Performance. In: Rincón, F., Barba, J., So, H., Diniz, P., Caba, J. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham. https://doi.org/10.1007/978-3-030-44534-8_15
DOI: https://doi.org/10.1007/978-3-030-44534-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44533-1
Online ISBN: 978-3-030-44534-8
eBook Packages: Computer Science (R0)