Integrating RISC-V SIMT and Scalar Cores: Loosely to Tightly Coupled

  • Conference paper
High Performance Computing. ISC High Performance 2024 International Workshops (ISC High Performance 2023)

Abstract

This paper investigates the integration of SIMT and scalar cores using the RISC-V-based Vortex GPGPU. We first detail a conventional integration with Purdue’s SoCET SoC AFTx07 that follows the standard host-device CPU-GPU model found in contemporary products. We then propose two architectures designed to address control flow divergence, which reduces efficiency in parallel computing because threads follow different execution paths. The first architecture, built on modifications to the Vortex GPU, statically prioritizes threads by degree of divergence: high-priority (highly divergent) threads are allocated to a scalar core, and lower-priority (less divergent) threads to the SIMT core. Although preliminary results show improved performance for scalar core threads, static priority assignment yields unpredictable performance gains because the scheduler cannot anticipate runtime fluctuations in thread divergence. The second architecture, currently under development, proposes a mechanism for runtime thread migration, laying the foundation for a system that can adjust to runtime conditions. A future, conceptual third architecture aims to dynamically assess the divergence of each thread and so optimize the integration of SIMT and scalar cores. This progression outlines a staged approach to mitigating control flow divergence and improving efficiency in parallel processing systems.
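
The static partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Thread` record, the per-thread `divergence` score, the `threshold` value, and the simplified lock-step cost model in `simt_warp_steps` are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    tid: int
    divergence: float  # hypothetical static estimate in [0, 1]


def partition_static(threads, threshold=0.5):
    """Split threads once, before execution, using a fixed divergence estimate.

    Threads at or above the (assumed) threshold go to the scalar core;
    the rest stay on the SIMT core. Nothing is re-evaluated at runtime.
    """
    scalar = [t for t in threads if t.divergence >= threshold]
    simt = [t for t in threads if t.divergence < threshold]
    return scalar, simt


def simt_warp_steps(paths):
    """Toy lock-step cost model: a warp serializes each distinct path.

    `paths` holds one tuple of instruction labels per thread. Threads on
    the same path execute together; different paths execute one after
    another, so the cost is the summed length of the distinct paths.
    """
    return sum(len(p) for p in set(paths))


# Hypothetical workload: thread 2 takes a different branch from the others.
threads = [Thread(0, 0.1), Thread(1, 0.2), Thread(2, 0.9), Thread(3, 0.1)]
paths = {
    0: ("a", "b"),
    1: ("a", "b"),
    2: ("c", "d", "e"),
    3: ("a", "b"),
}

scalar, simt = partition_static(threads)
steps_all_on_simt = simt_warp_steps([paths[t.tid] for t in threads])
steps_after_split = simt_warp_steps([paths[t.tid] for t in simt])
```

In this toy model, moving the divergent thread to the scalar core shortens the SIMT warp's serialized work. Because the scores are fixed before execution, the split cannot react if a thread's behavior changes at runtime, which is the limitation the abstract attributes to static assignment.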

Author information

Corresponding author

Correspondence to Sooraj Chetput.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chetput, S. et al. (2025). Integrating RISC-V SIMT and Scalar Cores: Loosely to Tightly Coupled. In: Weiland, M., Neuwirth, S., Kruse, C., Weinzierl, T. (eds) High Performance Computing. ISC High Performance 2024 International Workshops. ISC High Performance 2023. Lecture Notes in Computer Science, vol 15058. Springer, Cham. https://doi.org/10.1007/978-3-031-73716-9_24

  • DOI: https://doi.org/10.1007/978-3-031-73716-9_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73715-2

  • Online ISBN: 978-3-031-73716-9

  • eBook Packages: Computer Science, Computer Science (R0)
