Fast and memory-efficient GPU implementations of Krylov subspace methods for efficient power grid analysis

Authors:
Takumi Morishita, Kyoto University, Kyoto, Japan
Hiroshi Tsutsui, Kyoto University, Kyoto, Japan
Hiroyuki Ochi, Kyoto University, Kyoto, Japan
Takashi Sato, Kyoto University, Kyoto, Japan
GLSVLSI '13: Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI, May 2013, Pages 95–100, https://doi.org/10.1145/2483028.2483069

Published: 02 May 2013

ABSTRACT

Power grid analysis for modern LSI is computationally challenging in terms of both runtime and memory usage. In this paper, we implement Krylov subspace based linear circuit solvers on a graphics processing unit (GPU) to realize fast power grid analysis. We pursue efficient memory usage and memory access performance by improving the data structure that stores the elements of large sparse matrices. Experimental results on benchmark circuits show that the proposed data structures are more suitable than the widely used compressed sparse row (CSR) format and that our GPU implementations achieve up to 17x speedup over CPU implementations.
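
For readers unfamiliar with the baseline named in the abstract, the following is a minimal CPU-side sketch of CSR storage and an unpreconditioned conjugate gradient loop, one common Krylov subspace method for symmetric positive definite systems. It is an illustration under stated assumptions only: it does not reproduce the paper's proposed data structures or GPU kernels, and the function names and the 3-node example matrix are hypothetical.

```python
# Illustrative sketch only: CSR storage plus a plain conjugate gradient loop.
# The names csr_matvec and conjugate_gradient and the 3x3 example system are
# hypothetical; they are not taken from the paper.

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A (no preconditioner)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical example: A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]] in CSR form.
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [1.0, 0.0, 1.0]

x = conjugate_gradient(values, col_idx, row_ptr, b)
print(x)  # close to [1.0, 1.0, 1.0]
```

In this sketch almost all of the work per iteration is the sparse matrix-vector product, so the storage layout of the matrix largely determines the memory footprint and the memory access pattern of the solver.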



Published in
GLSVLSI '13: Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI
May 2013, 368 pages
ISBN: 9781450320320
DOI: 10.1145/2483028
General Chair: Jose L. Ayala, Complutense University of Madrid, Spain
Program Chairs: Alex Jones, University of Pittsburgh, USA; Patrick Madden, Binghamton University, USA
Publications Chair: Ayse K. Coskun, Boston University, USA
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags: GPGPU, Krylov subspace method, power grid analysis
Acceptance Rates
GLSVLSI '13 paper acceptance rate: 76 of 238 submissions, 32%. Overall acceptance rate: 312 of 1,156 submissions, 27%.