DOI: 10.1145/2048066.2048087
research-article

Safe parallel programming using dynamic dependence hints

Published: 22 October 2011

ABSTRACT

Speculative parallelization divides a sequential program into possibly parallel tasks and permits these tasks to run in parallel if and only if they show no dependences with each other. The parallelization is safe in that a speculative execution always produces the same output as the sequential execution.
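The safety property described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's system: the names `TrackedState`, `run_sequential`, and `run_speculative` are hypothetical, the tasks are assumed to be deterministic, and running every task against the same snapshot stands in for real parallel execution. A task's speculative result is committed only if it read nothing that an earlier task wrote; otherwise the task is re-executed on the committed state.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, int]


class TrackedState:
    """Private view of the shared state that logs the keys a task reads and writes."""

    def __init__(self, base: State) -> None:
        self._base = base
        self.reads: Set[str] = set()
        self.writes: State = {}

    def get(self, key: str) -> int:
        if key in self.writes:
            return self.writes[key]  # reading our own write is not a cross-task dependence
        self.reads.add(key)
        return self._base[key]

    def set(self, key: str, value: int) -> None:
        self.writes[key] = value


Task = Callable[[TrackedState], None]


def run_sequential(tasks: List[Task], state: State) -> State:
    """Reference semantics: run the tasks one after another."""
    current = dict(state)
    for task in tasks:
        view = TrackedState(current)
        task(view)
        current.update(view.writes)
    return current


def run_speculative(tasks: List[Task], state: State) -> Tuple[State, int]:
    """Speculate every task on one snapshot, then validate and commit in program order."""
    snapshot = dict(state)
    views = []
    for task in tasks:  # stands in for running the tasks in parallel
        view = TrackedState(snapshot)
        task(view)
        views.append(view)

    committed = dict(snapshot)
    dirty: Set[str] = set()  # keys written by tasks that have already committed
    reruns = 0
    for task, view in zip(tasks, views):
        if view.reads & dirty:  # the task read a stale value: speculation failed
            view = TrackedState(committed)
            task(view)
            reruns += 1
        committed.update(view.writes)
        dirty.update(view.writes)
    return committed, reruns


def inc_x(s: TrackedState) -> None:
    s.set("x", s.get("x") + 1)


def double_x_into_y(s: TrackedState) -> None:
    s.set("y", s.get("x") * 2)  # depends on inc_x


def add_to_z(s: TrackedState) -> None:
    s.set("z", s.get("z") + 5)  # independent of the other two


tasks = [inc_x, double_x_into_y, add_to_z]
initial = {"x": 1, "y": 0, "z": 10}
result, reruns = run_speculative(tasks, initial)
```

Here `double_x_into_y` reads `x` after `inc_x` has written it, so its speculative result is discarded and it is re-executed once, while `add_to_z` commits without re-execution. The final state matches `run_sequential(tasks, initial)`.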

In this paper, we present the dependence hint, an interface for a user to specify possible dependences between possibly parallel tasks. Dependence hints may be incorrect or incomplete but they do not change the program output. The interface extends Cytron's do-across and recent OpenMP ordering primitives and makes them safe and safely composable. We use it to express conditional and partial parallelism and to parallelize large-size legacy code. The prototype system is implemented as a software library. It is used to improve performance by nearly 10 times on average on current multicore machines for 8 programs including 5 SPEC benchmarks.
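The claim that hints may be wrong or missing without changing the output can be sketched in the same simulated setting. Everything below is an assumption made for illustration: `View`, `run_with_hints`, and the `(producer, consumer)` hint pairs are hypothetical names and are not the interface of the paper's library. In this simplified model a hinted consumer does not speculate; it runs at its turn in commit order, after every earlier task has committed. Tasks without hints speculate and are validated, so a missing or spurious hint affects only how much work is wasted or serialized.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, int]


class View:
    """Private view of the shared state that logs the keys a task reads and writes."""

    def __init__(self, base: State) -> None:
        self.base = base
        self.reads: Set[str] = set()
        self.writes: State = {}

    def get(self, key: str) -> int:
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)
        return self.base[key]

    def set(self, key: str, value: int) -> None:
        self.writes[key] = value


Task = Callable[[View], None]
Hint = Tuple[int, int]  # hypothetical (producer index, consumer index) pair


def run_with_hints(tasks: List[Task], state: State, hints: Set[Hint]) -> Tuple[State, int, int]:
    """Return (final state, tasks that waited because of a hint, failed speculations)."""
    snapshot = dict(state)
    waiting = {consumer for _, consumer in hints}

    # Tasks without a hint speculate on the snapshot (stands in for parallel execution).
    speculated = {i: View(snapshot) for i in range(len(tasks)) if i not in waiting}
    for i, view in speculated.items():
        tasks[i](view)

    committed = dict(snapshot)
    dirty: Set[str] = set()  # keys written by tasks that have already committed
    waited = reruns = 0
    for i, task in enumerate(tasks):
        view = speculated.get(i)
        if view is None:  # hinted: run only now, on up-to-date state
            view = View(committed)
            task(view)
            waited += 1
        elif view.reads & dirty:  # unhinted dependence caught by validation
            view = View(committed)
            task(view)
            reruns += 1
        committed.update(view.writes)
        dirty.update(view.writes)
    return committed, waited, reruns


def inc_x(s: View) -> None:
    s.set("x", s.get("x") + 1)


def double_x_into_y(s: View) -> None:
    s.set("y", s.get("x") * 2)  # truly depends on inc_x


def add_to_z(s: View) -> None:
    s.set("z", s.get("z") + 5)  # independent of the other two


tasks = [inc_x, double_x_into_y, add_to_z]
initial = {"x": 1, "y": 0, "z": 10}

no_hints = run_with_hints(tasks, initial, set())         # missing hint: one failed speculation
good_hint = run_with_hints(tasks, initial, {(0, 1)})     # correct hint: no failed speculation
wrong_hint = run_with_hints(tasks, initial, {(0, 2)})    # spurious hint: needless wait, still safe
```

All three runs end in the same state. Only the counts of waits and re-executions differ, which is the sense in which a hint is a performance suggestion rather than a correctness requirement.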


Published in

OOPSLA '11: Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
October 2011
1104 pages
ISBN: 9781450309400
DOI: 10.1145/2048066

ACM SIGPLAN Notices, Volume 46, Issue 10
OOPSLA '11
October 2011
1063 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2076021

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Overall acceptance rate: 268 of 1,244 submissions, 22%
