Abstract
Direct numerical simulations (DNS) of turbulent flows are increasingly important because they not only provide fundamental understanding of turbulence but also complement and extend experimental results. DNS at high Reynolds numbers, however, incurs a huge computing cost, so high-performance computing has been strongly pursued. In this study, we examine the feasibility of cost-efficient DNS on Intel Xeon Phi many-core processors, which are currently adopted by 10% of the 100 largest supercomputers in the world as listed on the Top500 site. For this purpose, we port and optimize our in-house turbulent flow solver, DNS-TBL (direct numerical simulation-turbulent boundary layer), on Xeon Phi Knights Landing (KNL) many-core processors and conduct benchmark tests on KNL and conventional multicore processors. We discuss the key architectural features of KNL processors and the strategies used to exploit them for performance enhancement. The optimized code is validated by simulating zero-pressure-gradient turbulent boundary layers at high Reynolds numbers and comparing the simulated turbulence statistics with those reported in previous studies. By detailing the optimization strategies and validation processes, this work can serve as a practical guideline for accelerating large-scale, high-precision DNS with many-core computing.
Acknowledgements
This work has been supported by the Korea Institute of Science and Technology Information (KISTI) institutional R&D Program (K-19-L02-C07) and the Intel Parallel Computing Center (IPCC) project funded by Intel Corporation, USA. The NURION computing resource supported by KISTI has been extensively utilized to carry out this work.
Cite this article
Kang, JH., Hwang, J., Sung, H.J. et al. High-performance simulations of turbulent boundary layer flow using Intel Xeon Phi many-core processors. J Supercomput 77, 9597–9614 (2021). https://doi.org/10.1007/s11227-021-03642-6
DOI: https://doi.org/10.1007/s11227-021-03642-6