ABSTRACT
Many problems in science and engineering are in practice modeled and solved through matrix computations. Often, the matrices involved have structure, such as symmetry or triangularity, that reduces the operation count needed to perform the computation. For example, dense linear systems of equations are solved by first reducing them to triangular form, and optimization problems may yield matrices with various kinds of structure. The well-known BLAS (basic linear algebra subprograms) interface provides a small set of structured matrix computations, chosen to serve a certain set of higher-level functions (LAPACK). However, if a user encounters a computation or structure that is not supported, she loses the benefits of the structure and must fall back on a generic library. In this paper, we address this problem by providing a compiler that translates a given basic linear algebra computation on structured matrices into optimized C code, optionally vectorized with intrinsics. Our work combines prior work on the Spiral-like LGen compiler with techniques from polyhedral compilation to mathematically capture matrix structures. In this paper we consider upper/lower triangular and symmetric matrices, but the approach is extensible to a much larger set including blocked structures. We run experiments on a modern Intel platform against the Intel MKL library and a baseline implementation, showing competitive performance results for both BLAS and non-BLAS functionality.
Supplemental Material
The auxiliary material contains two files: ae.pdf (guideline to install and run the artifact) and lgen-ae.tar.gz (the artifact's compressed folder).
References
- LGen: A basic linear algebra compiler. Available at http://spiral.net/software/lgen.html.
- E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, third edition, 1999.
- C. Bastoul. Code generation in the polyhedral model is easier than you think. In Parallel Architectures and Compilation Techniques (PACT), pages 7–16, 2004.
- U. Beaugnon, A. Kravets, S. van Haastregt, R. Baghdadi, D. Tweed, J. Absar, and A. Lokhmotov. VOBLA: A vehicle for optimized basic linear algebra. In Languages, Compilers and Tools for Embedded Systems (LCTES), pages 115–124, 2014.
- U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. In Programming Language Design and Implementation (PLDI), pages 101–113, 2008.
- J. J. Dongarra, J. Du Croz, S. Hammarling, and I. S. Duff. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software (TOMS), 16(1):1–17, 1990.
- D. Fabregat-Traver and P. Bientinesi. A domain-specific compiler for linear algebra operations. In High Performance Computing for Computational Science (VECPAR 2012), volume 7851 of Lecture Notes in Computer Science (LNCS), pages 346–361. Springer, 2013.
- P. Feautrier and C. Lengauer. Encyclopedia of Parallel Computing, chapter Polyhedron Model. Springer, 2011.
- F. Franchetti, F. de Mesmay, D. McFarlin, and M. Püschel. Operator language: A program generation framework for fast kernels. In IFIP Working Conference on Domain-Specific Languages (DSL WC), volume 5658 of Lecture Notes in Computer Science (LNCS), pages 385–410. Springer, 2009.
- K. Goto and R. A. van de Geijn. Anatomy of high-performance matrix multiplication. ACM Transactions on Mathematical Software (TOMS), 34(3):12:1–12:25, 2008.
- K. Goto and R. A. van de Geijn. High-performance implementation of the level-3 BLAS. ACM Transactions on Mathematical Software (TOMS), 35(1):4:1–4:14, 2008.
- T. Grosser, A. Groesslinger, and C. Lengauer. Polly — performing polyhedral optimizations on a low-level intermediate representation. Parallel Processing Letters, 22(04):1250010, 2012.
- G. Guennebaud, B. Jacob, et al. Eigen v3. http://eigen.tuxfamily.org.
- J. A. Gunnels, F. G. Gustavson, G. Henry, and R. A. van de Geijn. FLAME: Formal linear algebra methods environment. ACM Transactions on Mathematical Software (TOMS), 27(4):422–455, 2001.
- Intel math kernel library (MKL). http://software.intel.com/en-us/intel-mkl.
- D. Kim, L. Renganarayanan, D. Rostron, S. Rajopadhye, and M. M. Strout. Multi-level tiling: M for the price of one. In Supercomputing (SC), pages 1–12, 2007.
- N. Kyrtatas, D. G. Spampinato, and M. Püschel. A basic linear algebra compiler for embedded processors. In Design, Automation and Test in Europe (DATE), pages 1054–1059, 2015.
- B. Marker, J. Poulson, D. Batory, and R. van de Geijn. Designing linear algebra algorithms by transformation: Mechanizing the expert developer. In High Performance Computing for Computational Science (VECPAR 2012), volume 7851 of Lecture Notes in Computer Science (LNCS), pages 362–378. Springer, 2013.
- M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, 93(2):232–275, 2005.
- M. Püschel, F. Franchetti, and Y. Voronenko. Encyclopedia of Parallel Computing, chapter Spiral. Springer, 2011.
- D. G. Spampinato and M. Püschel. A basic linear algebra compiler. In International Symposium on Code Generation and Optimization (CGO), pages 23–32, 2014.
- F. G. Van Zee and R. A. van de Geijn. BLIS: A framework for rapidly instantiating BLAS functionality. ACM Transactions on Mathematical Software (TOMS), 41(3):14:1–14:33, 2015.
- F. G. Van Zee, E. Chan, R. A. van de Geijn, E. S. Quintana-Orti, and G. Quintana-Orti. The libFLAME library for dense matrix computations. IEEE Design & Test, 11(6):56–63, Nov. 2009.
- R. Veras and F. Franchetti. Capturing the expert: Generating fast matrix-multiply kernels with Spiral. In High Performance Computing for Computational Science (VECPAR 2014), volume 8969 of Lecture Notes in Computer Science (LNCS), pages 236–244. Springer, 2015.
- S. Verdoolaege. isl: An integer set library for the polyhedral model. In Mathematical Software (MS), volume 6327 of Lecture Notes in Computer Science (LNCS), pages 299–302. Springer, 2010.
- K. Yotov, X. Li, G. Ren, M. Garzaran, D. Padua, K. Pingali, and P. Stodghill. Is search really necessary to generate high-performance BLAS? Proceedings of the IEEE, 93(2):358–386, 2005.
A basic linear algebra compiler for structured matrices