DOI: 10.1145/3240765.3240771
research-article

A Formal Instruction-level GPU Model for Scalable Verification

Published: 05 November 2018

Abstract

GPUs have been widely used to accelerate big-data inference applications and scientific computing through their parallelized hardware resources and programming model. Their extreme parallelism increases the possibility of bugs such as data races and uncoalesced memory accesses, so verifying program correctness is critical. State-of-the-art GPU program verification efforts mainly analyze application-level programs, e.g., in C, and suffer from the following limitations: (1) a high false-positive rate due to coarse-grained abstraction of synchronization primitives, (2) the high complexity of reasoning about pointer arithmetic, and (3) the difficulty of keeping up with an evolving API for developing application-level programs. In this paper, we address these limitations by modeling GPUs and reasoning about programs at the instruction level. We formally model the Nvidia GPU at the Parallel Thread Execution (PTX) level using the recently proposed Instruction-Level Abstraction (ILA) model for accelerators. PTX is analogous to the Instruction-Set Architecture (ISA) of a general-purpose processor. Our formal ILA model of the GPU includes non-synchronization instructions as well as all synchronization primitives, enabling us to verify multithreaded programs. We demonstrate the applicability of our ILA model to scalable GPU program verification through data-race checking. The evaluation shows that our checker outperforms state-of-the-art GPU data race checkers, with fewer false positives and improved scalability.
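
To make the notion of an instruction-level data race concrete, the following is a minimal sketch and not the paper's ILA-based checker. It assumes a toy instruction format (`("ld", addr)`, `("st", addr)`, `("bar",)`) invented here for illustration, and it assumes every thread passes the same number of barriers. Two accesses are reported as a race when they come from different threads, touch the same address in the same barrier interval, and at least one of them is a store. All names (`Access`, `trace`, `find_races`, `kernel`, `check`) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Access(NamedTuple):
    thread: int      # thread index within the block
    epoch: int       # number of barriers the thread has passed so far
    address: int     # shared-memory address touched
    is_write: bool   # True for a store, False for a load


def trace(thread: int, program: list[tuple]) -> list[Access]:
    """Turn a toy per-thread instruction list into memory accesses.

    Hypothetical instruction forms: ("ld", addr), ("st", addr), ("bar",).
    """
    epoch, accesses = 0, []
    for instr in program:
        if instr[0] == "bar":
            epoch += 1  # accesses after the barrier fall into a new interval
        else:
            accesses.append(Access(thread, epoch, instr[1], instr[0] == "st"))
    return accesses


def find_races(accesses: list[Access]) -> set[tuple[int, int, int]]:
    """Return (address, thread_a, thread_b) triples that conflict."""
    by_slot = defaultdict(list)
    for acc in accesses:
        by_slot[(acc.epoch, acc.address)].append(acc)
    races = set()
    for (_, address), group in by_slot.items():
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                # Conflict: different threads, same interval, at least one store.
                if a.thread != b.thread and (a.is_write or b.is_write):
                    lo, hi = sorted((a.thread, b.thread))
                    races.add((address, lo, hi))
    return races


def kernel(tid: int, n: int, with_barrier: bool) -> list[tuple]:
    """Each thread stores to slot tid, then loads its neighbour's slot."""
    body = [("st", tid)]
    if with_barrier:
        body.append(("bar",))
    body.append(("ld", (tid + 1) % n))
    return body


def check(n: int, with_barrier: bool) -> set[tuple[int, int, int]]:
    """Collect the accesses of n threads and report conflicting pairs."""
    accesses = []
    for tid in range(n):
        accesses.extend(trace(tid, kernel(tid, n, with_barrier)))
    return find_races(accesses)
```

Calling `check(4, with_barrier=False)` reports one conflict per slot, because each thread's load can overlap its neighbour's store; with `with_barrier=True` the stores and loads fall into different barrier intervals and no conflict is reported. This enumerates concrete accesses for a fixed thread count, whereas a formal checker of the kind the abstract describes would reason about the accesses symbolically.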

Published In

2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
Nov 2018
939 pages

Publisher

IEEE Press

Cited By

  • (2024) Deductive Verification of SYCL in VerCors. Software Engineering and Formal Methods, pp. 182-199. DOI: 10.1007/978-3-031-77382-2_11. Online publication date: 4-Nov-2024
  • (2024) Structural testing for CUDA programming model. Concurrency and Computation: Practice and Experience 36:14. DOI: 10.1002/cpe.8105. Online publication date: 9-Apr-2024
  • (2023) High-Performance Implementation of the Identity-Based Signature Scheme in IEEE P1363 on GPU. ACM Transactions on Embedded Computing Systems 22:2, pp. 1-35. DOI: 10.1145/3564784. Online publication date: 24-Jan-2023
  • (2022) Instruction mapping techniques for processors with very long instruction word architectures. Journal of Electrical Engineering 73:6, pp. 387-395. DOI: 10.2478/jee-2022-0053. Online publication date: 24-Dec-2022
  • (2020) DELTA: Validate GPU Memory Profiling with Microbenchmarks. Proceedings of the International Symposium on Memory Systems, pp. 97-104. DOI: 10.1145/3422575.3422784. Online publication date: 28-Sep-2020
  • (2020) Formal Methods for GPGPU Programming: Is the Demand Met? Integrated Formal Methods, pp. 160-177. DOI: 10.1007/978-3-030-63461-2_9. Online publication date: 13-Nov-2020
  • (2019) ILAng: A Modeling and Verification Platform for SoCs Using Instruction-Level Abstractions. Tools and Algorithms for the Construction and Analysis of Systems, pp. 351-357. DOI: 10.1007/978-3-030-17462-0_21. Online publication date: 4-Apr-2019
