Abstract
We present the findings of a project to port an existing large lattice QCD codebase to GPUs and GPU clusters. From the start, our design principles were to strive for both productivity and performance while tackling the problems posed by a large, constantly evolving codebase. The resulting simulator reproduces the original results, runs up to 11 times faster than our highly optimized CPU code, and meets our productivity requirements. Multi-GPU support was implemented using MPI, and the implementation shows good weak scaling across nodes. We also consider the consequences of the dawning parallel computing era from a lattice QCD point of view and analyze where state-of-the-art contemporary parallel computing architectures could be improved.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rantalaiho, T. (2013). Porting Production Level Quantum Chromodynamics Code to Graphics Processing Units – A Case Study. In: Manninen, P., Öster, P. (eds) Applied Parallel and Scientific Computing. PARA 2012. Lecture Notes in Computer Science, vol 7782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36803-5_9
DOI: https://doi.org/10.1007/978-3-642-36803-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36802-8
Online ISBN: 978-3-642-36803-5
eBook Packages: Computer Science (R0)