GKLEE: concolic verification and test generation for GPUs

Authors:

Ganesh Gopalakrishnan,

Indradeep Ghosh,

Sreeranga P. Rajan

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

Pages 215 - 224

https://doi.org/10.1145/2145816.2145844

Published: 25 February 2012

Abstract

Programs written for GPUs often contain correctness errors such as races and deadlocks, or may compute the wrong result. Existing debugging tools often miss these errors because of their limited input-space and execution-space exploration. Existing tools based on conservative static analysis or conservative modeling of SIMD concurrency generate false alarms, resulting in wasted bug-hunting effort. They also often do not target performance bugs (non-coalesced memory accesses, memory bank conflicts, and divergent warps). We provide a new framework called GKLEE that can analyze C++ GPU programs, locating the aforesaid correctness and performance bugs. For these programs, GKLEE can also automatically generate tests that provide high coverage. These tests serve as concrete witnesses for every reported bug. They can also be used for downstream debugging, for example to test the kernel on the actual hardware. We describe the architecture of GKLEE and its symbolic virtual machine model, and report previously unknown bugs and performance issues that it detected in commercial SDK kernels. We also describe GKLEE's test-case reduction heuristics and the resulting scalability improvement for a given coverage target.
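To make the bug classes named in the abstract concrete, the following is a minimal sketch of how races, shared-memory bank conflicts, and divergent warps can be recognized from a concrete record of per-thread accesses. It is not GKLEE's algorithm: GKLEE reasons symbolically over inputs, whereas this sketch only enumerates one concrete execution. The function names, the access-tuple layout, the example kernel `a[tid] = a[tid + 1]`, and the constants `WARP_SIZE` and `NUM_BANKS` are all assumptions made for illustration.

```python
from itertools import combinations

WARP_SIZE = 32  # assumption: common NVIDIA warp width
NUM_BANKS = 32  # assumption: shared-memory bank count (hardware dependent)


def find_races(accesses):
    """Report conflicting accesses within one barrier interval.

    accesses: list of (thread_id, address, is_write) tuples.
    Returns a sorted list of (t1, t2, address) where two distinct threads
    touch the same address and at least one of the accesses is a write.
    """
    races = set()
    for (t1, a1, w1), (t2, a2, w2) in combinations(accesses, 2):
        if t1 != t2 and a1 == a2 and (w1 or w2):
            races.add((min(t1, t2), max(t1, t2), a1))
    return sorted(races)


def max_bank_conflict_degree(word_indices):
    """Largest number of distinct words in one warp access that share a bank."""
    banks = {}
    for word in set(word_indices):
        banks.setdefault(word % NUM_BANKS, set()).add(word)
    return max((len(words) for words in banks.values()), default=0)


def divergent_warps(branch_taken):
    """Return indices of warps whose threads disagree on a branch outcome.

    branch_taken: list of booleans indexed by thread id.
    """
    diverged = []
    for start in range(0, len(branch_taken), WARP_SIZE):
        warp = branch_taken[start:start + WARP_SIZE]
        if len(set(warp)) > 1:
            diverged.append(start // WARP_SIZE)
    return diverged


def shift_kernel_accesses(n):
    """Accesses of a hypothetical kernel a[tid] = a[tid + 1] with no barrier."""
    accesses = []
    for tid in range(n):
        accesses.append((tid, tid + 1, False))  # read a[tid + 1]
        accesses.append((tid, tid, True))       # write a[tid]
    return accesses


if __name__ == "__main__":
    print(find_races(shift_kernel_accesses(4)))
    print(max_bank_conflict_degree([2 * t for t in range(WARP_SIZE)]))
    print(divergent_warps([t % 2 == 0 for t in range(2 * WARP_SIZE)]))
```

In the example kernel, each thread reads the element that its neighbor writes, so every adjacent pair of threads races on one address. A stride-2 access pattern maps two distinct words to each even-numbered bank, and a branch on `tid % 2` splits every warp.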

Index Terms

GKLEE: concolic verification and test generation for GPUs
1. General and reference
  1. Cross-computing tools and techniques
    1. Validation
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Empirical software validation
      2. Process validation

Published In

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

February 2012

352 pages

ISBN:9781450311601

DOI:10.1145/2145816

General Chair: J. Ramanujam, Louisiana State University, USA

Program Chair: P. Sadayappan, The Ohio State University, USA

ACM SIGPLAN Notices Volume 47, Issue 8
PPOPP '12
August 2012
334 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2370036

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 February 2012

Qualifiers

Research-article

Conference

PPoPP '12

Sponsor:

SIGPLAN

PPoPP '12: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 25 - 29, 2012

New Orleans, Louisiana, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 120
Total Downloads: 833
Downloads (Last 12 months): 100
Downloads (Last 6 weeks): 17

Reflects downloads up to 01 Mar 2025
