ABSTRACT
Device model evaluation is one of the most time-consuming tasks in analog circuit simulators such as SPICE. Graphics Processing Unit (GPU) architectures allow massively parallel processing of vector data in SIMD fashion. In this paper, we present a formulation of double-precision device model equations that is compatible with stream computing. We present data that isolate typical bottlenecks, especially communication and kernel call overheads. Our results indicate speedups of up to 20X when these overheads are counted, and up to 50X when techniques to overcome them are applied. In particular, we show that our techniques remain valid for small device counts, a well-known problem case for accelerated parallel computing with communication overheads.
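To make the idea of a stream-compatible formulation concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes a simple diode model instead of a full SPICE transistor model, plain Python lists standing in for GPU streams, and hypothetical names (`eval_diode`, `eval_diode_stream`, `VT`, `v_crit`). The point it illustrates is that device parameters and terminal voltages are stored as parallel arrays, and that the per-device kernel avoids data-dependent branches so every element follows the same instruction sequence.

```python
import math

VT = 0.025852  # assumed thermal voltage near 300 K, in volts


def eval_diode(i_sat, v, v_crit=0.8):
    """Hypothetical per-element kernel: current and conductance of one diode.

    Above v_crit the exponential is extended linearly. The clamp is done with
    min() rather than an if/else, so all elements execute the same operations,
    which is what a SIMD lane requires.
    """
    v_clamped = min(v, v_crit)
    e = math.exp(v_clamped / VT)
    g = i_sat * e / VT                           # conductance at the clamped voltage
    i = i_sat * (e - 1.0) + g * (v - v_clamped)  # second term is zero below v_crit
    return i, g


def eval_diode_stream(i_sat_stream, v_stream):
    """Apply the kernel to every device; inputs and outputs are parallel arrays."""
    results = [eval_diode(s, v) for s, v in zip(i_sat_stream, v_stream)]
    i_stream = [r[0] for r in results]
    g_stream = [r[1] for r in results]
    return i_stream, g_stream


if __name__ == "__main__":
    currents, conductances = eval_diode_stream([1e-14, 1e-14, 2e-14],
                                               [0.0, 0.6, 0.9])
    for i, g in zip(currents, conductances):
        print(f"I = {i:.6e} A, g = {g:.6e} S")
```

On a GPU the list comprehension would be replaced by a single kernel launch over the whole stream, which is where the communication and kernel call overheads mentioned in the abstract arise. Python floats are IEEE double precision, matching the precision the abstract refers to.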
Index Terms
- Massive parallelization of SPICE device model evaluation on GPU-based SIMD architectures