
OpenCL-based Virtual Prototyping and Simulation of Many-Accelerator Architectures

Published: 24 September 2018

Abstract

Heterogeneous architectures featuring multiple hardware accelerators have been proposed as a promising solution for meeting the ever-increasing performance and power requirements of embedded systems. However, the large number of design parameters may result in many alternative architectural schemes and thus in extra design effort. To address this issue, OpenCL-based frameworks have recently been used for FPGA programming, making source code portable across multiple architectures. However, such OpenCL frameworks focus on RTL design and therefore do not support rapid prototyping or abstracted modeling of complex systems. Virtual Prototyping aims to overcome this problem by enabling system modeling at higher abstraction levels. This article combines the benefits of OpenCL and Virtual Prototyping by proposing an OpenCL-based prototyping framework for data-parallel many-accelerator systems, which (a) creates a SystemC Virtual Platform from OpenCL, (b) provides a co-simulation environment for the host and the Virtual Platform, (c) offers memory and interconnection models for parallel data processing, and (d) enables system evaluation with alternative real-number representations (e.g., fixed-point or 16-bit floating-point).
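
To make item (d) concrete, the following C++ sketch shows what evaluating a kernel under a fixed-point representation can look like: a dot product is computed on values quantized to a signed fixed-point format and can then be compared against the double-precision result. This is an illustration only and not the framework's implementation; the helper names (`to_fixed`, `from_fixed`, `fixed_mul`, `dot_fixed`) and the 32-bit storage width are assumptions made for the example, and saturation on overflow is deliberately left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: quantize a real value to signed fixed-point with
// `frac_bits` fractional bits, rounding to nearest. No saturation is applied.
static std::int32_t to_fixed(double x, int frac_bits) {
    assert(frac_bits >= 0 && frac_bits < 31);
    return static_cast<std::int32_t>(std::lround(x * (1 << frac_bits)));
}

// Hypothetical helper: convert a fixed-point value back to double.
static double from_fixed(std::int64_t q, int frac_bits) {
    return static_cast<double>(q) / (1 << frac_bits);
}

// Fixed-point multiply: widen to 64 bits, then rescale by 2^frac_bits.
// Integer division truncates toward zero, which is one source of error.
static std::int32_t fixed_mul(std::int32_t a, std::int32_t b, int frac_bits) {
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<std::int32_t>(wide / (std::int64_t{1} << frac_bits));
}

// Dot product evaluated entirely in fixed point; the result is converted
// to double so it can be compared with a floating-point reference.
static double dot_fixed(const std::vector<double>& a,
                        const std::vector<double>& b, int frac_bits) {
    assert(a.size() == b.size());
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += fixed_mul(to_fixed(a[i], frac_bits),
                         to_fixed(b[i], frac_bits), frac_bits);
    }
    return from_fixed(acc, frac_bits);
}
```

Calling `dot_fixed` with several values of `frac_bits` and comparing each result against a plain double-precision dot product gives a simple accuracy-versus-bit-width profile, which is the kind of trade-off such an evaluation is meant to expose.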
