Research article · DOI: 10.1145/1993498.1993554

Automatic parallelization via matrix multiplication

Published: 04 June 2011

ABSTRACT

Existing work on the parallelization of complicated reductions and scans has focused on formalism and has rarely addressed implementation. To bridge this gap between formalism and implementation, we have integrated parallelization via matrix multiplication into compiler construction. Our framework can handle complicated loops that existing compiler techniques cannot parallelize. Moreover, we have refined our framework with two sets of techniques: one enhances its capability for parallelization by extracting max-operators automatically, and the other improves the performance of parallelized programs by eliminating redundancy. We have implemented our framework and these techniques as a parallelizer in a compiler. Experiments on examples that existing compilers cannot parallelize demonstrate the scalability of programs parallelized by our implementation.
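The abstract's central idea can be illustrated with a small sketch. The following is not the paper's implementation; it is a minimal, hand-written example (all function names are my own) of the underlying algebra: a linear recurrence x[i] = a[i]*x[i-1] + b[i] is sequential as written, but each step is an affine map representable as a 2×2 matrix acting on the vector (x, 1). Matrix multiplication is associative, so the chain of step matrices can be regrouped and reduced in parallel.

```python
def seq_recurrence(a, b, x0):
    """Direct sequential loop: x[i] = a[i] * x[i-1] + b[i]."""
    x, out = x0, []
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out

def matmul2(m, n):
    """2x2 matrix product; a matrix is ((p, q), (r, s))."""
    (p, q), (r, s) = m
    (t, u), (v, w) = n
    return ((p * t + q * v, p * u + q * w),
            (r * t + s * v, r * u + s * w))

def scan_via_matrices(a, b, x0):
    """Same results via matrix products.

    Step i is the matrix ((a[i], b[i]), (0, 1)), which maps the
    vector (x, 1) to (a[i]*x + b[i], 1). Here the prefix products
    are folded left only to check the algebra; because matmul2 is
    associative, a parallel scan could compute them instead.
    """
    out = []
    acc = ((1, 0), (0, 1))  # identity matrix
    for ai, bi in zip(a, b):
        acc = matmul2(((ai, bi), (0, 1)), acc)  # prepend step i
        (p, q), _ = acc
        out.append(p * x0 + q)  # apply accumulated map to x0
    return out

a, b = [2, 3, 1, 4], [1, 0, 5, 2]
assert seq_recurrence(a, b, 1) == scan_via_matrices(a, b, 1)
```

The max-operator extraction mentioned in the abstract plays an analogous role: recurrences involving max (e.g., maximum prefix sums) live in the max-plus semiring, where "matrix multiplication" with (max, +) in place of (+, ×) is still associative and hence parallelizable the same way.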


Published in

PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2011, 668 pages. ISBN: 9781450306638. DOI: 10.1145/1993498. General Chair: Mary Hall. Program Chair: David Padua.

Also published in ACM SIGPLAN Notices, Volume 46, Issue 6 (PLDI '11), June 2011, 652 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/1993316.

Copyright © 2011 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 406 of 2,067 submissions (20%)
