High-Throughput Logic Timing Simulation on GPGPUs

Published: 24 June 2015

Abstract

Many EDA tasks, such as test set characterization or the precise estimation of power consumption, power droop, and temperature development, require a very large number of time-aware gate-level logic simulations. Until now, such characterizations have been feasible only for rather small designs or with reduced precision because of the high computational demands.
The new simulation system presented here accelerates such tasks by more than two orders of magnitude and provides, for the first time, fast and comprehensive timing simulations for industrial-sized designs. Hazards, pulse-filtering, and pin-to-pin delay are supported for the first time in a GPGPU-accelerated simulator, and the system can easily be extended to even more realistic delay models and further applications.
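To make the terms hazard, pulse-filtering, and pin-to-pin delay concrete, here is a minimal single-gate sketch in Python. It is not the simulator described in the article. The waveform encoding (an initial value plus sorted toggle times), the names `evaluate_gate`, `pin_delays`, and `min_pulse`, and the textbook inertial-delay filter are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical waveform encoding: (initial logic value, sorted toggle times).
Waveform = Tuple[int, List[int]]


def evaluate_gate(
    func: Callable[[Sequence[int]], int],
    inputs: Sequence[Waveform],
    pin_delays: Sequence[int],
    min_pulse: int,
) -> Waveform:
    """Compute a gate's output waveform from its input waveforms."""
    # Pin-to-pin delay: shift each input toggle by the delay of its own pin.
    events = sorted(
        (time + pin_delays[pin], pin)
        for pin, (_, toggles) in enumerate(inputs)
        for time in toggles
    )
    values = [initial for initial, _ in inputs]
    out_initial = func(values)
    out_value = out_initial
    out_toggles: List[int] = []
    for time, pin in events:
        values[pin] ^= 1  # the input on this pin toggles
        new_value = func(values)
        if new_value == out_value:
            continue  # this input change does not reach the output
        out_value = new_value
        if out_toggles and time - out_toggles[-1] < min_pulse:
            # Pulse filtering: the pulse opened by the previous edge is too
            # short, so drop that edge together with the current one.
            out_toggles.pop()
        else:
            out_toggles.append(time)
    return out_initial, out_toggles


def nand(values: Sequence[int]) -> int:
    """Two-or-more-input NAND used as the example gate function."""
    return 0 if all(values) else 1
```

With inputs `a = (0, [10])` and `b = (1, [15])`, equal pin delays of 10, and `min_pulse=2`, the NAND output shows a hazard: it starts at 1 and toggles at times 20 and 25. Raising `min_pulse` to 10 filters that pulse away, and changing the pin delays to `[10, 4]` reorders the arrivals so that no glitch is produced at all.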
A sophisticated mapping with efficient memory utilization and access patterns, as well as minimal synchronization and control flow divergence, is able to use the full potential of GPGPU architectures. To provide such a mapping, we combine for the first time the versatility of event-based timing simulation with the multi-dimensional parallelism used in GPU-based gate-level simulators. The result is a throughput-optimized timing simulation algorithm that runs many simulation instances in parallel and at the same time fully exploits gate-parallelism within the circuit.
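The two dimensions of parallelism mentioned above can be sketched as a level-by-level schedule in which every (gate, simulation instance) pair of one level is independent work. The Python sketch below is an assumption-laden illustration, not the article's GPU kernel: it uses zero-delay logic values instead of timed waveforms to stay short, and the names `levelize`, `simulate`, and the netlist encoding are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical netlist encoding: gate name -> (logic function, fan-in net names).
Netlist = Dict[str, Tuple[Callable[[Sequence[int]], int], List[str]]]


def levelize(netlist: Netlist, primary_inputs: Sequence[str]) -> List[List[str]]:
    """Group gates into topological levels; gates in one level are independent."""
    level = {net: 0 for net in primary_inputs}
    pending = set(netlist)
    while pending:
        ready = [g for g in pending if all(f in level for f in netlist[g][1])]
        if not ready:
            raise ValueError("combinational loop or undriven net")
        for gate in ready:
            level[gate] = 1 + max((level[f] for f in netlist[gate][1]), default=0)
        pending.difference_update(ready)
    depth = max(level.values(), default=0)
    return [sorted(g for g in netlist if level[g] == d) for d in range(1, depth + 1)]


def simulate(
    netlist: Netlist,
    primary_inputs: Sequence[str],
    stimuli: Sequence[Sequence[int]],
) -> List[Dict[str, int]]:
    """Evaluate all stimuli; returns one net-value dictionary per stimulus slot."""
    values = [dict(zip(primary_inputs, stimulus)) for stimulus in stimuli]
    for gates in levelize(netlist, primary_inputs):
        # Every (gate, slot) pair in this level is independent of the others.
        # On a GPU each pair could be assigned to its own thread, with one
        # synchronization between levels; here the pairs simply run in turn.
        for gate, slot in product(gates, range(len(stimuli))):
            func, fanin = netlist[gate]
            values[slot][gate] = func([values[slot][f] for f in fanin])
    return values


def nand(values: Sequence[int]) -> int:
    return 0 if all(values) else 1


def invert(values: Sequence[int]) -> int:
    return 1 - values[0]
```

For a small netlist such as `{"n1": (nand, ["a", "b"]), "n2": (nand, ["n1", "c"]), "n3": (invert, ["a"])}`, `levelize` places `n1` and `n3` in the first level and `n2` in the second, so the work grid of the first level has two gates times the number of stimuli. More stimuli widen the grid without adding synchronization points, which is the throughput argument in a nutshell.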

    Published In

    ACM Transactions on Design Automation of Electronic Systems  Volume 20, Issue 3
    June 2015
    345 pages
    ISSN:1084-4309
    EISSN:1557-7309
    DOI:10.1145/2796316
    Editor: Naehyuck Chang

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 June 2015
    Accepted: 01 January 2015
    Revised: 01 December 2013
    Received: 01 June 2013
    Published in TODAES Volume 20, Issue 3

    Author Tags

    1. Gate-level simulation
    2. general purpose computing on graphics processing unit (GP-GPU)
    3. hazards
    4. parallel CAD
    5. pin-to-pin delay
    6. pulse-filtering
    7. timing simulation

    Qualifiers

    • Research-article
    • Research
    • Refereed
