ABSTRACT
Fast compilation is essential for JIT-compilation use cases such as dynamic languages and databases, and it also improves developer productivity when compiling static languages. Template-based compilation achieves short compile times, but existing approaches generally rely on handwritten templates, which limits flexibility and requires substantial engineering effort.
In this paper, we introduce an approach based on MLIR that derives code templates for the instructions of any dialect automatically ahead-of-time. Template generation re-uses the existing compilation path present in the MLIR lowering of the instructions and thereby inherently supports code generation from different abstraction levels in a single step.
Our results on compiling database queries and standard C programs show a compile-time improvement of 10–30x compared to LLVM -O0 with only moderate run-time slowdowns of 1–3x, resulting in an overall improvement of 2x in a JIT-compilation-based database setting.
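To illustrate the general idea of template-based code generation, the following is a minimal sketch and not the implementation described in this paper. It assumes a small, handwritten template library, whereas the approach above derives templates automatically from the MLIR lowering. The operation names (`const`, `addi`, `ret`), the `Template` type, and the `compile_ops` helper are hypothetical. Each template is a precompiled x86-64 byte sequence with a zeroed 32-bit immediate as its hole; compilation copies templates in order and patches the holes with operand values.

```python
import struct
from typing import NamedTuple, Optional


class Template(NamedTuple):
    code: bytes                 # precompiled machine code, hole bytes zeroed
    hole_offset: Optional[int]  # byte offset of the 32-bit immediate, if any


# Hypothetical template library (assumption: handwritten here for illustration).
TEMPLATES = {
    "const": Template(b"\xB8\x00\x00\x00\x00", 1),  # mov eax, imm32
    "addi": Template(b"\x05\x00\x00\x00\x00", 1),   # add eax, imm32
    "ret": Template(b"\xC3", None),                 # ret
}


def compile_ops(ops):
    """Copy one template per operation and patch its immediate operand."""
    out = bytearray()
    for name, operand in ops:
        tmpl = TEMPLATES[name]
        start = len(out)
        out += tmpl.code  # copy step
        if tmpl.hole_offset is not None:
            # patch step: write the operand as a little-endian signed 32-bit value
            struct.pack_into("<i", out, start + tmpl.hole_offset, operand)
    return bytes(out)


code = compile_ops([("const", 40), ("addi", 2), ("ret", None)])
print(code.hex())
```

The sketch only emits bytes and does not execute them. Because compilation is reduced to a copy and a patch per operation, no instruction selection or register allocation runs at compile time, which is where the compile-time advantage of template-based approaches comes from.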
Index Terms
- Fast Template-Based Code Generation for MLIR