A GPU-enabled acceleration algorithm for the CAM5 cloud microphysics scheme

The Journal of Supercomputing

Abstract

The National Center for Atmospheric Research released the Community Atmosphere Model version 5.0 (CAM5), a global atmosphere model that provides global climate simulation for meteorological research. Among its components, the cloud microphysics scheme is extremely time-consuming, which makes efficient parallel algorithms necessary for large-scale, long-duration simulations. Because GPUs are widely used in science and engineering and NVIDIA's CUDA platform is mature and stable, we ported the code to the GPU to accelerate the computation. In this paper, we analyze the parallelism of the CAM5 cloud microphysics scheme (CAM5 CMS) in different dimensions and propose corresponding GPU-based one-dimensional (1D) and two-dimensional (2D) parallel acceleration algorithms. Of the two, the 2D algorithm exploits finer-grained parallelism. In addition, we present a method for optimizing data transfer between the CPU and GPU to further improve overall performance. Finally, we implemented a GPU version of the CAM5 CMS (GPU-CMS). With I/O transfer, GPU-CMS achieves a speedup of 141.69\(\times\) on a single NVIDIA A100 GPU. Without I/O transfer, compared with the baseline performance on a single Intel Xeon E5-2680 CPU core, the 2D acceleration algorithm achieves speedups of 48.75\(\times\), 280.11\(\times\), and 507.18\(\times\) on a single NVIDIA K20, P100, and A100 GPU, respectively.
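
The difference between the 1D and 2D decompositions can be illustrated with a small emulation of how GPU threads might be mapped to grid cells. This is a minimal sketch, not the GPU-CMS implementation: it assumes the two dimensions are horizontal columns (`ncol`) and vertical levels (`nlev`), which the abstract does not state, and the function names, block sizes, and grid sizes are hypothetical. The code is plain Python that mimics CUDA-style index arithmetic rather than running on a GPU.

```python
from math import ceil


def launch_1d(ncol, nlev, block=128):
    """Emulate a 1D launch: one thread per column, looping over levels.

    Assumption: the 1D scheme parallelizes over columns only.
    """
    nblocks = ceil(ncol / block)          # round up so every column is covered
    work = {}
    for tid in range(nblocks * block):    # global thread index
        if tid < ncol:                    # bounds guard for the padded last block
            work[tid] = [(tid, k) for k in range(nlev)]
    return work


def launch_2d(ncol, nlev, block=(32, 4)):
    """Emulate a 2D launch: one thread per (column, level) cell.

    Assumption: the 2D scheme parallelizes over columns and levels.
    """
    bx, by = block
    gx, gy = ceil(ncol / bx), ceil(nlev / by)
    work = {}
    for ty in range(gy * by):             # global index along the level axis
        for tx in range(gx * bx):         # global index along the column axis
            if tx < ncol and ty < nlev:   # bounds guard in both dimensions
                work[(tx, ty)] = [(tx, ty)]
    return work


def cells(work):
    """Flatten a thread-to-cells mapping into a sorted list of cells."""
    return sorted(c for assigned in work.values() for c in assigned)


if __name__ == "__main__":
    w1 = launch_1d(ncol=10, nlev=3)
    w2 = launch_2d(ncol=10, nlev=3)
    print(len(w1), "active threads in 1D,", len(w2), "active threads in 2D")
    print("same cells covered:", cells(w1) == cells(w2))
```

Both mappings cover the same cells, but the 2D mapping uses `ncol * nlev` active threads with one cell each, whereas the 1D mapping uses `ncol` threads that each iterate over `nlev` cells. This is the sense in which the 2D version is finer-grained. The sketch ignores any dependence between levels, which a real port would have to respect.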

Data availability

The data that support the findings of this study are available on request from the corresponding author.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 41931183, in part by the National Key Research and Development Program of China under Grant 2016YFB0200800, and in part by the National Key Scientific and Technological Infrastructure project “Earth System Science Numerical Simulator Facility” (Earth Lab).

Funding

Not applicable.

Author information

Contributions

YH contributed to methodology, software, and writing of the original draft; YW contributed to supervision, conceptualization, methodology, and review and editing; XZ contributed to writing of the original draft; XW, HZ, and JJ contributed to review and editing.

Corresponding author

Correspondence to Yuzhu Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hong, Y., Wang, Y., Zhang, X. et al. A GPU-enabled acceleration algorithm for the CAM5 cloud microphysics scheme. J Supercomput 79, 17784–17809 (2023). https://doi.org/10.1007/s11227-023-05360-7
