Abstract
We discuss a simple OpenACC implementation of the iterative BiCGSTAB algorithm for linear systems. Such systems arise in the numerical solution of diffusion-reaction problems, where the linear solver is the most computationally expensive component of the simulation (\(\sim\)80% of the time spent) and has therefore often been the primary target for parallelization. We deploy and test this method on a desktop workstation with two supported GPUs, one targeted at high-performance computing and one consumer-level GPU, to compute the temperature distribution on a honeycomb around the bee brood. The paper is written from a user's perspective, not that of a GPU computing expert, and aims to fill a gap we noticed between real-world application problems and the simple problems solved in introductory OpenACC tutorials or benchmarking studies.
Acknowledgement
Hardware and compiler (excluding the Tesla K20c) were purchased with an NSERC-RTI grant. The Tesla K20c was donated by NVIDIA. We thank Larry Banks for setting up the equipment and two anonymous referees for useful suggestions.
Appendix
OpenACC Fortran Code-stubs
The following code stubs are provided for illustration only. They are copied from the source codes used here but changed to reflect the notation in the text. The authors do not guarantee their proper working if copied from here and do not assume responsibility for any damages this might cause.
Matrix-vector product \(y:=Ax\) for sparse diagonal format. In the following code, \(d\) is the number of diagonals, the sparse matrix \(A\) is stored as an \(n \times d\) array, and ioff contains the offsets of the sub-diagonals. \(A\), \(x\), \(d\) and ioff are passed to the subroutine; \(y\) is returned.
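The original listing is not reproduced in this text. As a minimal sketch of such a kernel, and not the authors' code, the following Fortran module assumes the storage convention that `a(i,j)` holds the matrix entry in row `i` and column `i + ioff(j)`, with columns outside `1..n` skipped. The names `dia_matvec_mod` and `dia_matvec`, the explicit argument `n`, and the choice of a `parallel loop` with explicit data clauses are assumptions made for illustration.

```fortran
module dia_matvec_mod
  implicit none
contains
  ! y := A*x for a matrix stored by diagonals (hypothetical sketch).
  ! Assumed convention: a(i,j) is the entry in row i, column i + ioff(j).
  subroutine dia_matvec(n, d, a, ioff, x, y)
    integer, intent(in) :: n, d
    double precision, intent(in) :: a(n,d), x(n)
    integer, intent(in) :: ioff(d)
    double precision, intent(out) :: y(n)
    integer :: i, j, k
    double precision :: s

    ! Rows are independent, so the outer loop is parallelized;
    ! the short loop over the d diagonals runs sequentially per row.
    !$acc parallel loop copyin(a, ioff, x) copyout(y) private(s, k)
    do i = 1, n
      s = 0.0d0
      !$acc loop seq
      do j = 1, d
        k = i + ioff(j)
        ! Skip entries of a diagonal that fall outside the matrix.
        if (k >= 1 .and. k <= n) s = s + a(i,j) * x(k)
      end do
      y(i) = s
    end do
  end subroutine dia_matvec
end module dia_matvec_mod
```

For a tridiagonal matrix one would set `d = 3` and `ioff = (/ -1, 0, 1 /)` and call `call dia_matvec(n, 3, a, ioff, x, y)`. Without an OpenACC-enabled compiler the directives are treated as comments and the loop runs serially. Inside an iterative solver it is usually preferable to keep the arrays on the device in an enclosing data region rather than copying them on every call.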
Scalar product \(ddt:= x^T y\). In the following code stub, \(x\) and \(y\) are \(n\)-arrays passed to the function; the real variable ddt is returned.
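The original listing is not reproduced in this text. A minimal sketch of such a function, not the authors' code, could use an OpenACC sum reduction as below. The names `dot_mod` and `dot_acc`, the explicit argument `n`, and the local accumulator `s` are assumptions made for illustration; only the result name `ddt` follows the text.

```fortran
module dot_mod
  implicit none
contains
  ! ddt := x^T y, accumulated with a sum reduction (hypothetical sketch).
  function dot_acc(n, x, y) result(ddt)
    integer, intent(in) :: n
    double precision, intent(in) :: x(n), y(n)
    double precision :: ddt
    double precision :: s
    integer :: i

    s = 0.0d0
    ! Each iteration contributes one product; the reduction clause
    ! combines the partial sums into s.
    !$acc parallel loop copyin(x, y) reduction(+:s)
    do i = 1, n
      s = s + x(i) * y(i)
    end do
    ddt = s
  end function dot_acc
end module dot_mod
```

A call would read `ddt = dot_acc(n, x, y)`. Because a parallel reduction may add the terms in a different order than a serial loop, results can differ from a serial run in the last few digits.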
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Eberl, H.J., Sudarsan, R. (2014). OpenACC Parallelisation for Diffusion Problems, Applied to Temperature Distribution on a Honeycomb Around the Bee Brood: A Worked Example Using BiCGSTAB. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8385. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55195-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55194-9
Online ISBN: 978-3-642-55195-6
eBook Packages: Computer Science (R0)