Accelerating Radiative Transfer Simulation on NVIDIA GPUs with OpenACC

  • Conference paper
Parallel and Distributed Computing, Applications and Technologies (PDCAT 2022)

Abstract

The use of FPGAs alongside GPUs is emerging as a way to accelerate multiphysics applications, which are simulations that involve multiple physical models and multiple simultaneous physical phenomena. Because such a simulation contains operations with different performance characteristics, it is difficult to accelerate using GPUs alone. We therefore aim to improve overall application performance by using FPGAs to accelerate the operations whose characteristics lead to low GPU efficiency. However, the application is currently implemented through multilingual programming: the computation kernels that run on the GPU are written in CUDA, and those that run on the FPGA are written in OpenCL. This approach imposes a heavy burden on programmers, so we are working on a programming environment that enables the use of both accelerators with OpenACC in a GPU–FPGA equipped high-performance computing (HPC) cluster system. To this end, we port the entire code from the CUDA-OpenCL mixture to OpenACC alone. On this basis, this study quantitatively compares the performance of the OpenACC GPU implementation with that of the CUDA implementation for ARGOT, a multiphysics radiative transfer simulation code for fundamental astrophysics. We observe that the OpenACC implementation achieves performance and scalability comparable to the CUDA implementation on the Cygnus supercomputer equipped with NVIDIA V100 GPUs.
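As a rough illustration of the directive-based style the abstract refers to, the sketch below offloads a simple loop with a single OpenACC directive. It is not taken from ARGOT: the function name `optical_depth`, its parameters, and the choice of accumulating an optical depth along one ray are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not ARGOT code: sums sigma * density[i] * dl over
   the n cells crossed by one ray, giving a toy optical depth. */
double optical_depth(const double *density, int n, double sigma, double dl)
{
    assert(density != NULL && n >= 0);

    double tau = 0.0;

    /* One directive requests offloading: the density array is copied to
       the device and the per-cell terms are combined with a sum reduction.
       A compiler without OpenACC support ignores the pragma and runs the
       loop serially on the host. */
    #pragma acc parallel loop reduction(+:tau) copyin(density[0:n])
    for (int i = 0; i < n; ++i) {
        tau += sigma * density[i] * dl;
    }

    return tau;
}
```

Calling `optical_depth(density, n, sigma, dl)` from ordinary host code is enough to use it. An equivalent CUDA version would typically need an explicit kernel, device allocations, memory copies, and a hand-written reduction, which is the kind of programming burden a directive-based port is meant to reduce.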

Acknowledgements

This work used computational resources of TSUBAME3.0 provided by Tokyo Institute of Technology through the HPCI System Research Project (Project ID: hp190099). This work was supported by JSPS KAKENHI (Grant Number 21H04869). The Cygnus utilization is supported by the MCRP 2022 Program by the Center for Computational Sciences, University of Tsukuba. We also thank Dr. Naruhiko Tan of NVIDIA for his advice on OpenACC optimization.

Author information

Corresponding author

Correspondence to Ryohei Kobayashi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kobayashi, R. et al. (2023). Accelerating Radiative Transfer Simulation on NVIDIA GPUs with OpenACC. In: Takizawa, H., Shen, H., Hanawa, T., Hyuk Park, J., Tian, H., Egawa, R. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2022. Lecture Notes in Computer Science, vol 13798. Springer, Cham. https://doi.org/10.1007/978-3-031-29927-8_27

  • DOI: https://doi.org/10.1007/978-3-031-29927-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29926-1

  • Online ISBN: 978-3-031-29927-8

  • eBook Packages: Computer Science, Computer Science (R0)
