DOI: 10.1145/2854038.2854060
research-article

A basic linear algebra compiler for structured matrices

Published: 29 February 2016

ABSTRACT

Many problems in science and engineering are in practice modeled and solved through matrix computations. Often, the matrices involved have structure, such as being symmetric or triangular, which reduces the operation count needed to perform the computation. For example, dense linear systems of equations are solved by first converting them to triangular form, and optimization problems may yield matrices with any kind of structure. The well-known BLAS (Basic Linear Algebra Subprograms) interface provides a small set of structured matrix computations, chosen to serve a certain set of higher-level functions (LAPACK). However, if a user encounters a computation or structure that is not supported, she has to fall back on a generic library and loses the benefits of the structure. In this paper, we address this problem by providing a compiler that translates a given basic linear algebra computation on structured matrices into optimized C code, optionally vectorized with intrinsics. Our work combines prior work on the Spiral-like LGen compiler with techniques from polyhedral compilation to mathematically capture matrix structures. In this paper, we consider upper/lower triangular and symmetric matrices, but the approach is extensible to a much larger set, including blocked structures. We run experiments on a modern Intel platform against the Intel MKL library and a baseline implementation, showing competitive performance for both BLAS and non-BLAS functionalities.
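To make the operation-count argument concrete, the following hand-written C sketch compares a generic matrix-vector product with one that exploits upper triangular structure. It is an illustration only, not output of the compiler described above; the function names `gemv_dense` and `gemv_upper`, the row-major layout, and the multiply-add counter are assumptions introduced for this example.

```c
#include <stddef.h>

/* Generic product y = A*x for a dense n-by-n row-major matrix.
   Returns the number of multiply-adds performed, which is n*n. */
size_t gemv_dense(size_t n, const double *A, const double *x, double *y) {
    size_t ops = 0;
    for (size_t i = 0; i < n; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j) {
            acc += A[i * n + j] * x[j];
            ++ops;
        }
        y[i] = acc;
    }
    return ops;
}

/* Same product when U is upper triangular: entries with j < i are
   known to be zero, so the inner loop starts at j = i and never
   reads them. Returns n*(n+1)/2 multiply-adds. */
size_t gemv_upper(size_t n, const double *U, const double *x, double *y) {
    size_t ops = 0;
    for (size_t i = 0; i < n; ++i) {
        double acc = 0.0;
        for (size_t j = i; j < n; ++j) {
            acc += U[i * n + j] * x[j];
            ++ops;
        }
        y[i] = acc;
    }
    return ops;
}
```

For n = 3, the dense version performs 9 multiply-adds and the triangular version 6, and both produce the same result whenever the lower part of the matrix is zero. A structure-aware code generator aims to derive this kind of restricted iteration space automatically, including for structure combinations that a fixed library interface does not cover.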

Published in

CGO '16: Proceedings of the 2016 International Symposium on Code Generation and Optimization
February 2016, 283 pages
ISBN: 9781450337786
DOI: 10.1145/2854038

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

CGO '16 paper acceptance rate: 25 of 108 submissions (23%). Overall acceptance rate: 312 of 1,061 submissions (29%).
