Abstract
Simulating lattice models of quantum materials is one of the most important approaches to understanding the quantum properties of matter in condensed matter physics. The main task in such a simulation is to diagonalize the Hamiltonian matrix of the system and evaluate the electronic density of states. The kernel polynomial method (KPM) is one of the promising simulation methods. Because the KPM algorithm contains a fine-grained recursive part, it is hard to parallelize with the thread-level parallelism available on a supercomputer or a cluster computer. This paper focuses on methods for parallelizing KPM in the massively parallel environment of a GPU, aiming for enough parallelism to achieve greater speedups than recent CPUs. It proposes two implementation methods, called the full map method and the sliding window method, and evaluates their performance on a recent GPU platform. To enlarge the feasible simulation size while also improving performance, the paper also describes additional optimization techniques that depend on the GPU architecture.
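The abstract does not spell out the recursive part of KPM. As an illustration only, the sketch below uses the standard Chebyshev three-term recurrence on which KPM is commonly built, T_{n+1}(H~) r = 2 H~ T_n(H~) r - T_{n-1}(H~) r, to compute the moments mu_n = <r|T_n(H~)|r>. It is a plain CPU version in Python, not the full map or sliding window method proposed in the paper. The open 1D tight-binding chain, the rescaling factor of 2, and the function names `chain_matvec`, `dot`, and `chebyshev_moments` are all assumptions made for this example.

```python
import math


def chain_matvec(v, scale=2.0):
    """Apply H/scale to v, where H is an open 1D tight-binding chain
    with hopping -1 (an assumed toy Hamiltonian). The spectrum of H lies
    inside (-2, 2), so scale=2 maps it into (-1, 1)."""
    n = len(v)
    out = [0.0] * n
    # Each row is independent of the others: this loop is the
    # fine-grained, data-parallel part of one recursion step.
    for i in range(n):
        acc = 0.0
        if i > 0:
            acc -= v[i - 1]
        if i < n - 1:
            acc -= v[i + 1]
        out[i] = acc / scale
    return out


def dot(a, b):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def chebyshev_moments(matvec, r, num_moments):
    """Return mu_n = <r|T_n(H~)|r> for n = 0 .. num_moments - 1."""
    prev = list(r)       # T_0(H~) r = r
    curr = matvec(r)     # T_1(H~) r = H~ r
    moments = [dot(r, prev), dot(r, curr)]
    # Step n needs the results of steps n-1 and n-2, so the steps
    # themselves must run one after another.
    for _ in range(2, num_moments):
        hv = matvec(curr)
        nxt = [2.0 * h - p for h, p in zip(hv, prev)]
        moments.append(dot(r, nxt))
        prev, curr = curr, nxt
    return moments[:num_moments]


if __name__ == "__main__":
    # Start vector localized on the middle site of a 16-site chain.
    start = [0.0] * 16
    start[8] = 1.0
    print(chebyshev_moments(chain_matvec, start, 8))
```

The outer loop is inherently sequential, while each matrix-vector product inside it offers only row-level parallelism. This is one way to read the abstract's remark that the recursive part is fine-grained and therefore a better match for a GPU than for coarse thread-level parallelism.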
Cite this article
Zhang, S., Yamagiwa, S., Okumura, M. et al. Kernel Polynomial Method on GPU. Int J Parallel Prog 41, 59–88 (2013). https://doi.org/10.1007/s10766-012-0204-y
DOI: https://doi.org/10.1007/s10766-012-0204-y