DOI: 10.1145/3178487.3178515

Efficient parallel determinacy race detection for two-dimensional dags

Published: 10 February 2018

Abstract

A program is said to have a determinacy race if logically parallel parts of a program access the same memory location and one of the accesses is a write. These races are generally bugs in the program since they lead to non-deterministic program behavior: different schedules of the program can lead to different results. Most prior work on detecting these races focuses on a subclass of programs with fork-join parallelism.
This paper presents a race-detection algorithm, 2D-Order, for detecting races in a more general class of programs, namely programs whose dependence structure can be represented as planar dags embedded in 2D grids. Such dependence structures arise from programs that use pipelined parallelism or dynamic programming recurrences. Given a computation with $T_1$ work and $T_\infty$ span, 2D-Order executes it while also detecting races in $O(T_1/P + T_\infty)$ time on $P$ processors, which is asymptotically optimal.
We also implemented PRacer, a race-detection algorithm based on 2D-Order for Cilk-P, which is a language for expressing pipeline parallelism. Empirical results demonstrate that PRacer incurs reasonable overhead and exhibits scalability similar to the baseline (executions without race detection) when running on multiple cores.
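
To make the definition of a determinacy race concrete, the following is a minimal illustrative sketch and not the paper's 2D-Order algorithm. It assumes a full 2D grid dag in which node (i, j) has edges to (i+1, j) and (i, j+1), so one node precedes another exactly when it lies weakly above and to the left of it. The names `Node`, `precedes`, `logically_parallel`, and `find_races` are hypothetical, and the pairwise check is quadratic in the number of accesses, unlike the bound stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A node of a 2D grid dag, identified by its grid coordinates."""
    row: int
    col: int


def precedes(a: Node, b: Node) -> bool:
    # Assumption: a full grid dag with edges (i, j) -> (i+1, j) and
    # (i, j) -> (i, j+1). Then a path from a to b exists exactly when
    # a is weakly above and weakly to the left of b.
    return a != b and a.row <= b.row and a.col <= b.col


def logically_parallel(a: Node, b: Node) -> bool:
    # Two distinct nodes are logically parallel if neither precedes the other.
    return a != b and not precedes(a, b) and not precedes(b, a)


def find_races(accesses):
    """Return racing pairs from a list of (node, location, is_write) tuples.

    Brute-force pairwise comparison, for illustration only.
    """
    races = []
    for i, (n1, loc1, w1) in enumerate(accesses):
        for n2, loc2, w2 in accesses[i + 1:]:
            same_location = loc1 == loc2
            at_least_one_write = w1 or w2
            if same_location and at_least_one_write and logically_parallel(n1, n2):
                races.append((n1, n2, loc1))
    return races


if __name__ == "__main__":
    demo = [
        (Node(0, 1), "x", True),   # write to x
        (Node(1, 0), "x", False),  # read of x, parallel with the write above
        (Node(1, 1), "x", True),   # write to x, ordered after both
    ]
    print(find_races(demo))
```

In the demo, only the write at (0, 1) and the read at (1, 0) are reported, since (1, 1) is ordered after both of them. An efficient detector would avoid the pairwise comparison by maintaining ordering information on the fly.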

Supplementary Material

pracer (pracer.zip)
Supplement

Published In

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2018
442 pages
ISBN:9781450349826
DOI:10.1145/3178487
  • ACM SIGPLAN Notices, Volume 53, Issue 1
    PPoPP '18
    January 2018
    426 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/3200691
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication Notes

Badge change: Article originally badged under Version 1.0 guidelines https://www.acm.org/publications/policies/artifact-review-badging

Qualifiers

  • Research-article

Conference

PPoPP '18

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Cited By

  • (2022) PINT: Parallel INTerval-Based Race Detector. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 850-861. DOI: 10.1109/IPDPS53621.2022.00087. Online publication date: May-2022
  • (2021) Efficient Access History for Race Detection. Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 449-451. DOI: 10.1145/3409964.3461825. Online publication date: 6-Jul-2021
  • (2021) Efficient Parallel Determinacy Race Detection for Structured Futures. Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 398-409. DOI: 10.1145/3409964.3461815. Online publication date: 6-Jul-2021
  • (2020) Parallel determinacy race detection for futures. Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 217-231. DOI: 10.1145/3332466.3374536. Online publication date: 19-Feb-2020
  • (2019) Processor-Oblivious Record and Replay. ACM Transactions on Parallel Computing 6:4, 1-28. DOI: 10.1145/3365659. Online publication date: 17-Dec-2019
  • (2019) Efficient race detection with futures. Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, 340-354. DOI: 10.1145/3293883.3295732. Online publication date: 16-Feb-2019
  • (2018) Race Detection in Two Dimensions. ACM Transactions on Parallel Computing 4:4, 1-22. DOI: 10.1145/3264618. Online publication date: 7-Sep-2018
  • (2018) Towards concurrency race debugging. Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 1-13. DOI: 10.1145/3243176.3243206. Online publication date: 1-Nov-2018
  • (2018) Brief Announcement. Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, 351-353. DOI: 10.1145/3210377.3210658. Online publication date: 11-Jul-2018
  • (2018) Runtime Determinacy Race Detection for OpenMP Tasks. Euro-Par 2018: Parallel Processing, 31-45. DOI: 10.1007/978-3-319-96983-1_3. Online publication date: 1-Aug-2018
