ABSTRACT
The Moving Particle Semi-implicit (MPS) method is a particle method used in fields such as computational fluid dynamics. Target fluids and objects are divided into particles, and each particle interacts with its neighbour particles. The search for neighbour particles is the main bottleneck of the MPS method, accounting for 56% of the total processing time. In this paper, we port the neighbour-particle search of the MPS method to GPUs using OpenACC and optimize it. We present three different optimizations and evaluate them with three data sets of 25,704, 224,910, and 2,247,750 particles on four GPUs: NVIDIA K20c, GTX1080, P100 (PCIe), and P100 (NVLink). With 2,247,750 particles, the P100 (NVLink) GPU achieves a 41.5-fold speed-up over the CPU version running 24 MPI processes.
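To make the idea of a neighbour-particle search concrete, the following is a minimal sketch in C with OpenACC directives. It is not the implementation evaluated in the paper and does not reproduce any of its three optimizations. It is a brute-force search that assumes a fixed cut-off radius and a fixed maximum number of neighbours per particle. The function name `find_neighbours` and the parameters `r_e`, `max_nb`, `nb_count`, and `nb_list` are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Hypothetical helper (not from the paper): brute-force neighbour search.
 * pos      : n particle positions stored as x,y,z triples (length 3*n)
 * r_e      : assumed cut-off radius for the interaction
 * max_nb   : assumed upper bound on neighbours stored per particle
 * nb_count : output, number of neighbours stored for each particle
 * nb_list  : output, max_nb slots per particle, unused slots set to -1
 * Without an OpenACC compiler the pragmas are ignored and the loops
 * run sequentially on the CPU with the same result. */
void find_neighbours(int n, const double *pos, double r_e, int max_nb,
                     int *nb_count, int *nb_list)
{
    assert(n > 0 && max_nb > 0 && r_e > 0.0);
    const double r2 = r_e * r_e;

    /* One independent iteration per particle i; each i writes only
     * its own slots, so the outer loop can be offloaded in parallel. */
    #pragma acc parallel loop copyin(pos[0:3*n]) \
        copyout(nb_count[0:n], nb_list[0:n*max_nb])
    for (int i = 0; i < n; ++i) {
        int c = 0;

        /* The inner scan appends to particle i's list in order,
         * so it is kept sequential. */
        #pragma acc loop seq
        for (int j = 0; j < n; ++j) {
            double dx = pos[3*j]     - pos[3*i];
            double dy = pos[3*j + 1] - pos[3*i + 1];
            double dz = pos[3*j + 2] - pos[3*i + 2];
            double d2 = dx*dx + dy*dy + dz*dz;
            if (j != i && d2 < r2 && c < max_nb) {
                nb_list[i*max_nb + c] = j;
                ++c;
            }
        }

        /* Mark the unused slots so the host never reads garbage. */
        #pragma acc loop seq
        for (int k = c; k < max_nb; ++k) {
            nb_list[i*max_nb + k] = -1;
        }
        nb_count[i] = c;
    }
}
```

This sketch compares every pair of particles, so its cost grows quadratically with the particle count. Production codes usually restrict the scan to nearby particles, for example with a background grid of cells, which is the kind of step where optimization effort tends to concentrate.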