DOI:10.1145/2854038.2854060
research-article

A basic linear algebra compiler for structured matrices

Published: 29 February 2016

Abstract

Many problems in science and engineering are in practice modeled and solved through matrix computations. Often, the matrices involved have structure, such as being symmetric or triangular, which reduces the operation count needed to perform the computation. For example, dense linear systems of equations are solved by first converting them to triangular form, and optimization problems may yield matrices with any kind of structure. The well-known BLAS (Basic Linear Algebra Subprograms) interface provides a small set of structured matrix computations, chosen to serve a certain set of higher-level functions (LAPACK). However, if a user encounters a computation or structure that is not supported, she has to fall back on a generic library and loses the benefits of the structure. In this paper, we address this problem by providing a compiler that translates a given basic linear algebra computation on structured matrices into optimized C code, optionally vectorized with intrinsics. Our work combines prior work on the Spiral-like LGen compiler with techniques from polyhedral compilation to mathematically capture matrix structures. In the paper we consider upper/lower triangular and symmetric matrices, but the approach is extensible to a much larger set of structures, including blocked ones. We run experiments on a modern Intel platform against the Intel MKL library and a baseline implementation, showing competitive performance results for both BLAS and non-BLAS functionalities.
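To make the operation-count claim concrete, the following is a minimal hand-written sketch and not output of the compiler described in the abstract. It compares a generic matrix-vector product with one that exploits lower triangular structure. The function names `matvec_general` and `matvec_lower`, the row-major layout, and the returned operation counters are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Generic product y = A*x for a dense n-by-n matrix stored row-major.
   Returns the number of multiply-add operations performed (n*n). */
size_t matvec_general(size_t n, const double *A, const double *x, double *y)
{
    size_t ops = 0;
    for (size_t i = 0; i < n; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j) {
            acc += A[i * n + j] * x[j];
            ++ops;
        }
        y[i] = acc;
    }
    return ops;
}

/* Same product when A is known to be lower triangular: entries with
   j > i are zero, so the inner loop stops at the diagonal and those
   entries are never read. Performs n*(n+1)/2 multiply-adds. */
size_t matvec_lower(size_t n, const double *A, const double *x, double *y)
{
    size_t ops = 0;
    for (size_t i = 0; i < n; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j <= i; ++j) {
            acc += A[i * n + j] * x[j];
            ++ops;
        }
        y[i] = acc;
    }
    return ops;
}
```

Both functions compute the same result for a lower triangular input, but the structured version performs roughly half the work. A generic library routine cannot make this saving unless it is told about the structure.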

Supplementary Material

Auxiliary Archive (p117-spampinato-s.zip)
The auxiliary material contains two files: ae.pdf (guideline to install and run the artifact) and lgen-ae.tar.gz (the artifact's compressed folder).

Published In

CGO '16: Proceedings of the 2016 International Symposium on Code Generation and Optimization
February 2016
283 pages
ISBN:9781450337786
DOI:10.1145/2854038
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Basic linear algebra
  2. DSL
  3. Program synthesis
  4. SIMD vectorization
  5. Structured matrices
  6. Tiling

Qualifiers

  • Research-article

Conference

CGO '16

Acceptance Rates

CGO '16 Paper Acceptance Rate 25 of 108 submissions, 23%;
Overall Acceptance Rate 312 of 1,061 submissions, 29%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 28
  • Downloads (last 6 weeks): 12
Reflects downloads up to 02 Mar 2025

Citations

Cited By

  • (2025) Synthesis of Quantum Simulators by Compilation. Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, pages 284-298. DOI:10.1145/3696443.3708949. Online publication date: 1-Mar-2025
  • (2025) SySTeC: A Symmetric Sparse Tensor Compiler. Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, pages 47-62. DOI:10.1145/3696443.3708919. Online publication date: 1-Mar-2025
  • (2023) Compiling Structured Tensor Algebra. Proceedings of the ACM on Programming Languages, 7:OOPSLA2, pages 204-233. DOI:10.1145/3622804. Online publication date: 16-Oct-2023
  • (2022) ReACT. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, pages 1-13. DOI:10.1145/3559009.3569685. Online publication date: 8-Oct-2022
  • (2022) The Linear Algebra Mapping Problem. Current State of Linear Algebra Languages and Libraries. ACM Transactions on Mathematical Software, 48:3, pages 1-30. DOI:10.1145/3549935. Online publication date: 10-Sep-2022
  • (2020) Meta-programming for cross-domain tensor optimizations. ACM SIGPLAN Notices, 53:9, pages 79-92. DOI:10.1145/3393934.3278131. Online publication date: 7-Apr-2020
  • (2020) Synthesis of Incremental Linear Algebra Programs. ACM Transactions on Database Systems, 45:3, pages 1-44. DOI:10.1145/3385398. Online publication date: 26-Aug-2020
  • (2019) Efficient differentiable programming in a functional array-processing language. Proceedings of the ACM on Programming Languages, 3:ICFP, pages 1-30. DOI:10.1145/3341701. Online publication date: 26-Jul-2019
  • (2019) Toward Modeling Cache-Miss Ratio for Dense-Data-Access-Based Optimization. Proceedings of the 30th International Workshop on Rapid System Prototyping (RSP'19), pages 64-70. DOI:10.1145/3339985.3358498. Online publication date: 17-Oct-2019
  • (2019) Optimizing tensor contractions for embedded devices with racetrack memory scratch-pads. Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, pages 5-18. DOI:10.1145/3316482.3326351. Online publication date: 23-Jun-2019
