Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7871)

Abstract

GPU-based computing has made significant strides in recent years. Unfortunately, GPU program optimizations can introduce subtle concurrency errors, so incisive formal bug-hunting methods are essential. This paper presents a new formal bug-hunting method for GPU programs that combine barriers and atomics. We present the conflict-directed delay-bounded scheduling algorithm (CD), which uses conflicts among atomic synchronization commands to trigger the generation of alternate schedules; these alternate schedules are executed in a delay-bounded manner. We formally describe CD and present two correctness-checking methods, one based on final-state comparison and the other on user assertions. We evaluate our implementation on realistic GPU benchmarks, with encouraging results.

Supported by NSF CCF 1255776, OCI 1148127, and the Microsoft SEIF Award.
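
The abstract only outlines the idea, so the following is a toy sketch of it and not the paper's CD algorithm. It assumes a round-robin base scheduler in which a "delay" moves the current thread to the back of the queue, spends delays only where the next atomic operation conflicts with a pending operation of another thread, and checks correctness by comparing final states. The operation kinds (`add`, `exch`), the look-ahead conflict test, and all function names are hypothetical and chosen for illustration.

```python
from collections import deque

# Toy model (an assumption, not the paper's formalism): each thread is a list
# of atomic operations (kind, address, value) applied to a shared memory.

def apply_op(memory, op):
    """Return a new memory after applying one atomic operation."""
    kind, addr, val = op
    new = dict(memory)
    if kind == "add":        # atomic add: commutes with other adds
        new[addr] = new.get(addr, 0) + val
    elif kind == "exch":     # atomic exchange: overwrites the cell
        new[addr] = val
    else:
        raise ValueError(f"unknown operation kind: {kind}")
    return new

def conflicts(op, other):
    """Two atomics conflict if they share an address and do not commute."""
    same_addr = op[1] == other[1]
    both_add = op[0] == "add" and other[0] == "add"
    return same_addr and not both_add

def has_conflict(threads, pcs, tid):
    """Does the next op of thread tid conflict with a pending op elsewhere?"""
    op = threads[tid][pcs[tid]]
    return any(
        conflicts(op, other_op)
        for t, ops in enumerate(threads) if t != tid
        for other_op in ops[pcs[t]:]
    )

def explore(threads, delay_bound):
    """Collect the final states reachable with at most delay_bound delays."""
    finals = set()

    def step(memory, pcs, queue, delays_left):
        # Drop threads that have finished all their operations.
        queue = deque(t for t in queue if pcs[t] < len(threads[t]))
        if not queue:
            finals.add(frozenset(memory.items()))
            return
        tid = queue[0]
        # Default choice: run the next operation of the thread at the head.
        op = threads[tid][pcs[tid]]
        new_pcs = pcs[:tid] + (pcs[tid] + 1,) + pcs[tid + 1:]
        step(apply_op(memory, op), new_pcs, queue, delays_left)
        # Alternate schedule: spend a delay only at a conflicting operation.
        if delays_left > 0 and len(queue) > 1 and has_conflict(threads, pcs, tid):
            rotated = deque(queue)
            rotated.rotate(-1)   # move the delayed thread to the back
            step(memory, pcs, rotated, delays_left - 1)

    step({}, tuple(0 for _ in threads), deque(range(len(threads))), delay_bound)
    return finals

if __name__ == "__main__":
    # Both threads add to x (commutative) and then overwrite y (conflicting).
    racy = [[("add", "x", 1), ("exch", "y", 1)],
            [("add", "x", 2), ("exch", "y", 2)]]
    for bound in (0, 1):
        states = explore(racy, bound)
        verdict = "deterministic" if len(states) == 1 else "schedule-dependent"
        print(bound, verdict, sorted(sorted(s) for s in states))
```

In this toy model, more than one final state means the result depends on the schedule, which is the kind of discrepancy a final-state comparison is meant to expose. The commuting `add` operations never trigger a delay, so the search branches only at the conflicting `exch` operations.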

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chiang, WF., Gopalakrishnan, G., Li, G., Rakamarić, Z. (2013). Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding. In: Brat, G., Rungta, N., Venet, A. (eds) NASA Formal Methods. NFM 2013. Lecture Notes in Computer Science, vol 7871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38088-4_15

  • DOI: https://doi.org/10.1007/978-3-642-38088-4_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38087-7

  • Online ISBN: 978-3-642-38088-4

  • eBook Packages: Computer Science, Computer Science (R0)