OpenACC is a high-level programming model that uses directives to offload computation to accelerators. This paper explores the benefit of using OpenACC performance tuning directives to specify GPU scheduling manually, compared with the scheduling OpenACC applies by default. We specified the scheduling manually using the gang and vector clauses of a directive and applied it to matrix-matrix multiply and Classical Gram-Schmidt orthonormalisation test cases. We tested on NVIDIA M2090 and K20 GPGPUs with both the PGI and CAPS implementations of OpenACC. The speedup realised by tuning the gang and vector values ranged from 1.0 to 3.1 across the test cases examined. This shows that the gang and vector values can have a large impact on performance, and that in some cases the compilers automatically select ideal gang and vector values.
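To illustrate the comparison the abstract describes, here is a minimal C sketch of a matrix-matrix multiply written twice: once leaving the schedule to the compiler, and once with explicit gang and vector settings. It is not the paper's code. The function names, the row-major layout, and the values `NUM_GANGS` and `VECTOR_LEN` are assumptions chosen for illustration, since the abstract does not report the values that were tuned. The paper may also have used a different form of the clauses than the `num_gangs` and `vector_length` form shown here.

```c
#include <assert.h>

/* Hypothetical tuning values; the abstract does not state the ones used. */
#define NUM_GANGS 256
#define VECTOR_LEN 128

/* C = A * B for n x n row-major matrices; the compiler picks the schedule. */
void matmul_default(int n, const float *restrict a,
                    const float *restrict b, float *restrict c)
{
    assert(n > 0);
    #pragma acc parallel loop copyin(a[0:n*n], b[0:n*n]) copyout(c[0:n*n])
    for (int i = 0; i < n; ++i) {
        #pragma acc loop
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            #pragma acc loop seq
            for (int k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Same computation, with the gang count and vector length set by hand:
 * rows are distributed across gangs, columns across vector lanes. */
void matmul_tuned(int n, const float *restrict a,
                  const float *restrict b, float *restrict c)
{
    assert(n > 0);
    #pragma acc parallel loop gang num_gangs(NUM_GANGS) vector_length(VECTOR_LEN) \
        copyin(a[0:n*n], b[0:n*n]) copyout(c[0:n*n])
    for (int i = 0; i < n; ++i) {
        #pragma acc loop vector
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            #pragma acc loop seq
            for (int k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Both functions compute the same result. Without an OpenACC-enabled compiler the directives are ignored and the loops run serially on the host, so the sketch can be checked for correctness before any tuning. A tuning study of the kind described would time `matmul_tuned` over a range of gang and vector values and compare the results against `matmul_default`.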