Abstract
In this work, we investigate the global memory access mechanism on recent GPUs. For the purpose of this study, we created specific benchmark programs, which allowed us to explore the scheduling of global memory transactions. Thus, we formulate a model capable of estimating the execution time for a large class of applications. Our main goal is to facilitate optimisation of regular data-parallel applications on GPUs. As an example, we finally describe our CUDA implementations of LBM flow solvers on which our model was able to estimate performance with less than 5% relative error.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Dongarra, J., Moore, S., Peterson, G., Tomov, S., Allred, J., Natoli, V., Richie, D.: Exploring new architectures in accelerating CFD for Air Force applications. In: Proceedings of HPCMP Users Group Conference, Citeseer, pp. 14–17 (2008)
nVidia: Compute Unified Device Architecture Programming Guide version 2.3.1 (August 2009)
McNamara, G.R., Zanetti, G.: Use of the Boltzmann Equation to Simulate Lattice-Gas Automata. Phys. Rev. Lett. 61, 2332–2335 (1988)
Qian, Y.H., d’Humières, D., Lallemand, P.: Lattice BGK models for Navier-Stokes equation. Europhys. Lett. 17(6), 479–484 (1992)
d’Humières, D., Ginzburg, I., Krafczyk, M., Lallemand, P., Luo, L.: Multiple-relaxation-time lattice Boltzmann models in three dimensions. Philosophical Transactions: Mathematical, Physical and Engineering Sciences, 437–451 (2002)
Ryoo, S., Rodrigues, C.I., Baghsorkhi, S.S., Stone, S.S., Kirk, D.B., Hwu, W.W.: Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 73–82. ACM, New York (2008)
Pohl, T., Kowarschik, M., Wilke, J., Iglberger, K., Rüde, U.: Optimization and Profiling of the Cache Performance of Parallel Lattice Boltzmann Codes. Parallel Processing Letters 13(4), 549–560 (2003)
Kuznik, F., Obrecht, C., Rusaouën, G., Roux, J.J.: LBM Based Flow Simulation Using GPU Computing Processor. Computers and Mathematics with Applications (27) (June 2009)
Tölke, J., Krafczyk, M.: TeraFLOP computing on a desktop PC with GPUs for 3D CFD. International Journal of Computational Fluid Dynamics 22(7), 443–456 (2008)
Obrecht, C., Kuznik, F., Tourancheau, B., Roux, J.J.: A new approach to the lattice Boltzmann method for graphics processing units. Computers and Mathematics with Applications (in press, 2010)
Papadopoulou, M., Sadooghi-Alvandi, M., Wong, H.: Micro-benchmarking the GT200 GPU
van der Laan, W.J.: Decuda G80 dissassembler version 0.4 (2007)
Collange, S., Defour, D., Tisserand, A.: Power Consumption of GPUs from a Software Perspective. In: Proceedings of the 9th International Conference on Computational Science: Part I, p. 923. Springer, Heidelberg (2009)
Volkov, V., Demmel, J.: Benchmarking GPUs to tune dense linear algebra. In: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. IEEE Press, Piscataway (2008)
Peng, Y., Shu, C., Chew, Y.T.: A 3D incompressible thermal lattice Boltzmann model and its application to simulate natural convection in a cubic cavity. Journal of Computational Physics 193(1), 260–274 (2004)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Obrecht, C., Kuznik, F., Tourancheau, B., Roux, JJ. (2011). Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units. In: Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds) High Performance Computing for Computational Science – VECPAR 2010. VECPAR 2010. Lecture Notes in Computer Science, vol 6449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19328-6_16
Download citation
DOI: https://doi.org/10.1007/978-3-642-19328-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19327-9
Online ISBN: 978-3-642-19328-6
eBook Packages: Computer ScienceComputer Science (R0)