research-article

Fast Template-Based Code Generation for MLIR

Authors:
Florian Drescher, Technical University Munich, Munich, Germany (ORCID: 0009-0004-7333-3401)
Alexis Engelke, Technical University Munich, Munich, Germany (ORCID: 0000-0003-1900-1292)

CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, February 2024, Pages 1–12. https://doi.org/10.1145/3640537.3641567

Published: 20 February 2024

Related Artifact: Artifact for CC'24 paper on "Fast Template-Based Code Generation for MLIR" (software, February 2024). https://doi.org/10.5281/zenodo.10571103

ABSTRACT

Fast compilation is essential for JIT-compilation use cases like dynamic languages or databases as well as development productivity when compiling static languages. Template-based compilation allows fast compilation times, but in existing approaches, templates are generally handwritten, limiting flexibility and causing substantial engineering effort.

In this paper, we introduce an approach based on MLIR that derives code templates for the instructions of any dialect automatically ahead-of-time. Template generation re-uses the existing compilation path present in the MLIR lowering of the instructions and thereby inherently supports code generation from different abstraction levels in a single step.

Our results on compiling database queries and standard C programs show a compile-time improvement of 10–30x compared to LLVM -O0 with only moderate run-time slowdowns of 1–3x, resulting in an overall improvement of 2x in a JIT-compilation-based database setting.
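
As a rough illustration of the general idea behind template-based code generation (not the implementation described in the paper), the sketch below prepares byte templates with holes ahead of time and, at compile time, only copies them and patches in constants. The toy instruction encoding, the `TEMPLATES` table, and the `emit` and `run` helpers are all hypothetical names introduced for this example; a real system would use machine-code templates derived from the compiler's existing lowering.

```python
import struct

# Hypothetical toy encoding: one opcode byte, PUSH is followed by a
# 4-byte little-endian immediate. This stands in for real machine code.
PUSH, ADD, MUL = 0x01, 0x02, 0x03

# Templates are prepared ahead of time: fixed bytes plus hole offsets to patch.
TEMPLATES = {
    "const": (bytes([PUSH, 0, 0, 0, 0]), [1]),  # one hole at offset 1
    "add": (bytes([ADD]), []),
    "mul": (bytes([MUL]), []),
}

def emit(ops):
    """Copy each operation's template into the buffer and patch its holes."""
    out = bytearray()
    for name, *args in ops:
        code, holes = TEMPLATES[name]
        base = len(out)
        out += code
        for hole, value in zip(holes, args):
            struct.pack_into("<i", out, base + hole, value)
    return bytes(out)

def run(code):
    """Tiny stack interpreter for the toy encoding, used only to check emit()."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            stack.append(struct.unpack_from("<i", code, pc + 1)[0])
            pc += 5
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == ADD else a * b)
            pc += 1
    return stack.pop()

# (2 + 3) * 4 expressed as a sequence of operations
program = [("const", 2), ("const", 3), ("add",), ("const", 4), ("mul",)]
code = emit(program)
result = run(code)
```

The point of the sketch is that `emit` does no instruction selection or register allocation at compile time; all of that cost is assumed to have been paid when the templates were created.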

Index Terms

Fast Template-Based Code Generation for MLIR
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Just-in-time compilers
      2. Translator writing systems and compiler generators

Published in
CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction
February 2024
261 pages
ISBN: 9798400705076
DOI: 10.1145/3640537
General Chair: Gabriel Rodríguez (Universidade da Coruña, Spain)
Program Chairs: P. Sadayappan (University of Utah, USA), Aravind Sukumaran-Rajam (Meta, USA)
Copyright © 2024 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Binary Code Patching
Fast Compilation
JIT Compilation
MLIR
Template-based Compilation