ABSTRACT
Power grid analysis for modern LSI circuits is computationally challenging in terms of both runtime and memory usage. In this paper, we implement Krylov subspace based linear circuit solvers on a graphics processing unit (GPU) to realize fast power grid analysis. We pursue efficient memory usage and memory access performance by improving the data structure that stores the elements of large sparse matrices. Experimental results on benchmark circuits show that the proposed data structures are better suited to this task than the widely used compressed sparse row (CSR) format, and that our GPU implementations achieve up to 17x speedup over CPU implementations.
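For readers unfamiliar with the baseline mentioned above, the following is a minimal CPU-side sketch of the CSR format and of conjugate gradient, one Krylov subspace method, built on a CSR matrix-vector product. It is not the data structure or GPU kernel proposed in the paper. The function names (`csr_matvec`, `conjugate_gradient`), the array names (`values`, `col_idx`, `row_ptr`), and the 3x3 example matrix are assumptions chosen for illustration, and the solver assumes a symmetric positive definite system.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # The column lookup makes the read of x an indirect (gather) access.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A in CSR form."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small
            break
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical 3-node example: A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [1.0, 0.0, 1.0]

x = conjugate_gradient(values, col_idx, row_ptr, b)
print(x)
```

Each iteration performs one sparse matrix-vector product plus a few vector updates, so the layout of the nonzero entries and their indices largely determines both the memory footprint and the access pattern of the solver.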