Integrating RISC-V SIMT and Scalar Cores: Loosely to Tightly Coupled

Chetput, Sooraj; Nallathambi, Anusuya; Bowles, Spencer; Cambridge, Justin; Chitsazzadeh, Alex; Gundala, Gagan; Han, Zengxiang; Hong, Johnathan; Hu, Guilliame; Nallagatla, Ronit; Patel, Ansh; Pham, Khoi; Ramshanker, Abinands; Yan, Htet; Zhang, FangLing; Lagpacan, Zach; Hughes, Clay; Pedretti, Kevin; Johnson, Mark; Rogers, Timothy G.

doi:10.1007/978-3-031-73716-9_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15058)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

This paper investigates the integration of SIMT and scalar cores using the RISC-V-based Vortex GPGPU. We first detail a conventional integration with Purdue’s SoCET SoC AFTx07 that follows the standard host-device CPU-GPU model found in contemporary products. Subsequently, we propose two innovative architectures designed to address control flow divergence, which impedes efficiency in parallel computing by causing threads to follow divergent execution paths. The first architecture introduces a system where threads are statically prioritized based on degrees of divergence: high-priority threads (highly divergent) are allocated to a scalar core, and lower-priority (less divergent) ones to the SIMT core, based on modifications to the Vortex GPU. Although preliminary results show improved performance for scalar core threads, the static nature of thread priority assignment results in unpredictable performance enhancements because the scheduler cannot foresee runtime fluctuations in thread divergence. The second architecture, currently under development, proposes a mechanism for runtime thread migration, setting a foundation for a system capable of adjusting to runtime conditions. A future, conceptual third architecture aims to dynamically assess the divergence of each thread, optimizing the integration of SIMT and scalar cores for advanced computing. This progression outlines a staged approach to mitigating control flow divergence and improving efficiency in parallel processing systems.
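The static scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use any Vortex API: the per-thread `divergence` score, the `scalar_slots` parameter, and both function names are hypothetical, introduced only to show the idea of sending the most divergent threads to a scalar core and measuring how divergence wastes SIMT lanes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    tid: int
    divergence: float  # hypothetical static estimate, 0.0 (uniform) to 1.0 (highly divergent)


def partition_static(threads, scalar_slots):
    """Assign the most divergent threads to the scalar core, the rest to the SIMT core.

    The split is decided once, before execution, from the static scores.
    """
    ranked = sorted(threads, key=lambda t: t.divergence, reverse=True)
    scalar = ranked[:scalar_slots]
    simt = ranked[scalar_slots:]
    return scalar, simt


def simt_efficiency(active_lanes_per_step, warp_width):
    """Fraction of lane slots doing useful work across a sequence of issue steps."""
    if not active_lanes_per_step:
        return 1.0
    used = sum(active_lanes_per_step)
    available = warp_width * len(active_lanes_per_step)
    return used / available


if __name__ == "__main__":
    threads = [Thread(0, 0.1), Thread(1, 0.9), Thread(2, 0.4), Thread(3, 0.7)]
    scalar, simt = partition_static(threads, scalar_slots=2)
    print("scalar core:", [t.tid for t in scalar])
    print("SIMT core:  ", [t.tid for t in simt])
    # A 4-wide warp that splits into two 2-lane paths for two of four steps.
    print("SIMT efficiency:", simt_efficiency([4, 2, 2, 4], warp_width=4))
```

Because the partition is fixed before execution, a thread whose divergence changes at runtime stays on the core it was first given, which is the limitation the abstract attributes to static priority assignment.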

Author information

Authors and Affiliations

Purdue University, West Lafayette, IN, 49706, USA
Sooraj Chetput, Anusuya Nallathambi, Spencer Bowles, Justin Cambridge, Alex Chitsazzadeh, Gagan Gundala, Zengxiang Han, Johnathan Hong, Guilliame Hu, Ronit Nallagatla, Ansh Patel, Khoi Pham, Abinands Ramshanker, Htet Yan, FangLing Zhang, Zach Lagpacan, Mark Johnson & Timothy G. Rogers
Sandia National Laboratories, Albuquerque, NM, 87123, USA
Clay Hughes & Kevin Pedretti

Corresponding author

Correspondence to Sooraj Chetput.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Johannes Gutenberg University Mainz, Mainz, Germany
Sarah Neuwirth
Cerfacs, Toulouse, France
Carola Kruse
Durham University, Durham, UK
Tobias Weinzierl

About this paper

Cite this paper

Chetput, S. et al. (2025). Integrating RISC-V SIMT and Scalar Cores: Loosely to Tightly Coupled. In: Weiland, M., Neuwirth, S., Kruse, C., Weinzierl, T. (eds) High Performance Computing. ISC High Performance 2024 International Workshops. ISC High Performance 2023. Lecture Notes in Computer Science, vol 15058. Springer, Cham. https://doi.org/10.1007/978-3-031-73716-9_24

DOI: https://doi.org/10.1007/978-3-031-73716-9_24
Published: 14 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73715-2
Online ISBN: 978-3-031-73716-9
eBook Packages: Computer Science, Computer Science (R0)
