GKLEE: concolic verification and test generation for GPUs

DOI: 10.1145/2145816.2145844
Published: 25 February 2012

Abstract

Programs written for GPUs often contain correctness errors such as races and deadlocks, or may compute the wrong result. Existing debugging tools often miss these errors because they explore only a limited part of the input space and execution space. Tools based on conservative static analysis or conservative modeling of SIMD concurrency generate false alarms, wasting bug-hunting effort. They also often do not target performance bugs (non-coalesced memory accesses, memory bank conflicts, and divergent warps). We present a new framework called GKLEE that analyzes C++ GPU programs and locates the aforesaid correctness and performance bugs. For these programs, GKLEE can also automatically generate tests that provide high coverage. These tests serve as concrete witnesses for every reported bug. They can also be used for downstream debugging, for example to test the kernel on the actual hardware. We describe the architecture of GKLEE and its symbolic virtual machine model, and report previously unknown bugs and performance issues that it detected in commercial SDK kernels. We also describe GKLEE's test-case reduction heuristics and the resulting scalability improvement for a given coverage target.
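
To make the idea of a race with a concrete witness more tangible, the following is a minimal Python sketch. It is not GKLEE's algorithm: GKLEE explores inputs symbolically, whereas this toy simply enumerates a few candidate inputs. The kernel body, the function names (`kernel_accesses`, `find_races`, `first_racy_input`), and the `stride` parameter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kernel_accesses(num_threads, stride):
    """Shared-memory accesses of a hypothetical kernel in one barrier interval.

    Models the CUDA-style statement
        shared[(tid * stride) % num_threads] = tid;
    Each access is recorded as (thread_id, address, kind), kind is 'r' or 'w'.
    """
    return [(tid, (tid * stride) % num_threads, "w") for tid in range(num_threads)]


def find_races(accesses):
    """Return sorted (address, thread_a, thread_b) triples that conflict.

    Two accesses conflict when distinct threads touch the same address
    within the same barrier interval and at least one of them is a write.
    """
    by_addr = defaultdict(list)
    for tid, addr, kind in accesses:
        by_addr[addr].append((tid, kind))

    races = set()
    for addr, hits in by_addr.items():
        for (t1, k1), (t2, k2) in combinations(hits, 2):
            if t1 != t2 and "w" in (k1, k2):
                races.add((addr, min(t1, t2), max(t1, t2)))
    return sorted(races)


def first_racy_input(num_threads, candidate_strides):
    """Brute-force stand-in for input exploration (an assumption of this sketch).

    Returns (stride, races) for the first candidate input that triggers a
    race, i.e. a concrete witness, or None if no candidate does.
    """
    for stride in candidate_strides:
        races = find_races(kernel_accesses(num_threads, stride))
        if races:
            return stride, races
    return None
```

With four threads, `first_racy_input(4, range(1, 4))` reports the input `stride = 2`, for which threads 0 and 2 both write address 0 and threads 1 and 3 both write address 2. Strides 1 and 3 map every thread to a distinct address, so a test that happens to use only those inputs would miss the race, which is the limited input-space exploration the abstract refers to.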

Published In

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
February 2012
352 pages
ISBN: 9781450311601
DOI: 10.1145/2145816

  • ACM SIGPLAN Notices, Volume 47, Issue 8 (PPoPP '12)
    August 2012
    334 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2370036
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CUDA
  2. GPU
  3. automatic test generation
  4. formal verification
  5. parallelism
  6. symbolic execution
  7. virtual machine

Qualifiers

  • Research-article

Conference

PPoPP '12

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%
