Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding

Chiang, Wei-Fan; Gopalakrishnan, Ganesh; Li, Guodong; Rakamarić, Zvonimir

doi:10.1007/978-3-642-38088-4_15

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7871)

Abstract

GPU-based computing has made significant strides in recent years. Unfortunately, GPU program optimizations can introduce subtle concurrency errors, so incisive formal bug-hunting methods are essential. This paper presents a new formal bug-hunting method for GPU programs that combine barriers and atomics. We present the conflict-directed delay-bounded scheduling algorithm (CD), which exploits the occurrence of conflicts among atomic synchronization commands to trigger the generation of alternate schedules; these alternate schedules are executed in a delay-bounded manner. We formally describe CD and present two correctness checking methods, one based on final state comparison and the other on user assertions. We evaluate our implementation on realistic GPU benchmarks, with encouraging results.
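
The formal description of CD is not part of this preview, so the following is only a toy sketch of the idea as the abstract states it, not the authors' algorithm or implementation. It assumes a deterministic run-to-completion base scheduler, allows a "delay" (skipping to the next live thread) only at atomic operations whose address is also touched by another thread's atomics, and compares the final states of all schedules within the delay bound. The names `Op`, `explore`, `atomic_add`, `atomic_exch`, `local_write`, and `THREADS` are hypothetical, and the atomics are modeled as plain Python functions rather than real CUDA calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

State = Dict[str, int]

@dataclass(frozen=True)
class Op:
    name: str
    addr: Optional[str]               # address of an atomic; None for a local step
    apply: Callable[[State], None]    # effect on the shared state

# Hypothetical stand-ins for atomic operations and a thread-local write.
def atomic_add(addr: str, v: int) -> Op:
    return Op(f"atomicAdd({addr},{v})", addr, lambda s: s.update({addr: s[addr] + v}))

def atomic_exch(addr: str, v: int) -> Op:
    return Op(f"atomicExch({addr},{v})", addr, lambda s: s.update({addr: v}))

def local_write(name: str, v: int) -> Op:
    return Op(f"{name}={v}", None, lambda s: s.update({name: v}))

def conflict_addrs(threads: List[List[Op]]) -> Set[str]:
    """Addresses touched by atomics of at least two different threads."""
    owners: Dict[str, Set[int]] = {}
    for tid, ops in enumerate(threads):
        for op in ops:
            if op.addr is not None:
                owners.setdefault(op.addr, set()).add(tid)
    return {a for a, tids in owners.items() if len(tids) > 1}

def explore(threads: List[List[Op]], init: State, delay_bound: int) -> Set[Tuple]:
    """Final states reachable using at most `delay_bound` delays."""
    hot = conflict_addrs(threads)
    n = len(threads)
    finals: Set[Tuple] = set()

    def next_live(pcs: Tuple[int, ...], t: int) -> int:
        while pcs[t] >= len(threads[t]):
            t = (t + 1) % n
        return t

    def run(pcs: Tuple[int, ...], state: State, cur: int, delays_left: int) -> None:
        live = [t for t in range(n) if pcs[t] < len(threads[t])]
        if not live:
            finals.add(tuple(sorted(state.items())))
            return
        cur = next_live(pcs, cur)     # base scheduler: stay on cur until it finishes
        op = threads[cur][pcs[cur]]
        # Conflict-directed delay: only at conflicting atomics, within the bound.
        if delays_left > 0 and op.addr in hot and len(live) > 1:
            run(pcs, state, next_live(pcs, (cur + 1) % n), delays_left - 1)
        # Default choice: execute the operation on a copy of the state.
        new_state = dict(state)
        op.apply(new_state)
        new_pcs = pcs[:cur] + (pcs[cur] + 1,) + pcs[cur + 1:]
        run(new_pcs, new_state, cur, delays_left)

    run(tuple(0 for _ in threads), dict(init), 0, delay_bound)
    return finals

# Two threads whose atomics on "x" do not commute.
THREADS = [
    [local_write("r0", 1), atomic_add("x", 1)],
    [atomic_exch("x", 10)],
]

for k in (0, 1):
    finals = explore(THREADS, {"x": 0}, k)
    verdict = "schedule-dependent" if len(finals) > 1 else "single final state"
    print(f"delay bound {k}: {verdict}")
```

With a delay bound of 0 only the default schedule runs, so the order dependence between the add and the exchange goes unnoticed; a single delay at the conflicting `atomicAdd` is enough to expose a second final state. In this sketch, a user assertion could be checked in the same loop by evaluating a predicate on each final state instead of comparing the states with one another.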

Supported by NSF CCF 1255776, OCI 1148127, and the Microsoft SEIF Award.

Author information

Authors and Affiliations

School of Computing, University of Utah, USA
Wei-Fan Chiang, Ganesh Gopalakrishnan & Zvonimir Rakamarić
Fujitsu Labs of America, USA
Guodong Li

Editor information

Editors and Affiliations

NASA Ames Research Center, M/S 269-2, Moffett Field, 94035-0001, CA, USA
Guillaume Brat
Stinger Ghaffarian Technologies Inc., NASA Ames Research Center, M/S 269-2, Moffett Field, 94035-0001, CA, USA
Neha Rungta
NASA Ames Research Center, Carnegie Mellon University, M/S 269-2, Moffett Field, 94035-0001, CA, USA
Arnaud Venet

About this paper

Cite this paper

Chiang, WF., Gopalakrishnan, G., Li, G., Rakamarić, Z. (2013). Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding. In: Brat, G., Rungta, N., Venet, A. (eds) NASA Formal Methods. NFM 2013. Lecture Notes in Computer Science, vol 7871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38088-4_15

DOI: https://doi.org/10.1007/978-3-642-38088-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38087-7
Online ISBN: 978-3-642-38088-4
eBook Packages: Computer Science, Computer Science (R0)
