Abstract
We have accelerated a legacy massively parallel code that solves the 3D Maxwell’s equations on a hybrid cluster enhanced with GPUs. To minimize the impact on our existing code, we combine its original Full-MPI approach with task parallelism to design an accelerated \(LL^t\) solver that efficiently shares the same GPUs between different processes and relies on optimized communication patterns. On 180 nodes of the Tera100 cluster, our GPU-accelerated \(LL^t\) decomposition reaches \(80\) TFlop/s on a problem with \(247980\) unknowns, whereas the machine’s sustained CPU and GPU peaks are \(13\) and \(153\) TFlop/s, respectively.
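To put the reported throughput in context, the short sketch below compares the \(80\) TFlop/s figure with the sustained peaks quoted above. It uses only the three numbers from the abstract. Summing the CPU and GPU peaks into one "combined" peak is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Figures quoted in the abstract, all in TFlop/s.
ACHIEVED = 80.0   # GPU-accelerated LL^t decomposition on 180 nodes
CPU_PEAK = 13.0   # sustained CPU peak of the same nodes
GPU_PEAK = 153.0  # sustained GPU peak of the same nodes


def ratio(achieved: float, reference: float) -> float:
    """Return achieved / reference as a plain ratio."""
    return achieved / reference


# Assumption (not stated in the abstract): the two peaks simply add up.
combined_peak = CPU_PEAK + GPU_PEAK

vs_gpu = ratio(ACHIEVED, GPU_PEAK)            # share of the GPU peak
vs_combined = ratio(ACHIEVED, combined_peak)  # share of CPU + GPU peak
vs_cpu = ratio(ACHIEVED, CPU_PEAK)            # multiple of the CPU peak

print(f"Share of sustained GPU peak:      {vs_gpu:.1%}")
print(f"Share of combined CPU + GPU peak: {vs_combined:.1%}")
print(f"Multiple of sustained CPU peak:   {vs_cpu:.1f}x")
```

Under that assumption, the decomposition runs at roughly half of the sustained GPU peak and about six times the sustained CPU peak of the same nodes.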
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Augonnet, C., Goudin, D., Pujols, A., Sesques, M. (2014). Accelerating a Massively Parallel Numerical Simulation in Electromagnetism Using a Cluster of GPUs. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_55
DOI: https://doi.org/10.1007/978-3-642-55224-3_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55223-6
Online ISBN: 978-3-642-55224-3
eBook Packages: Computer Science (R0)