DOI: 10.1145/3640537.3641567

Fast Template-Based Code Generation for MLIR

Published: 20 February 2024

ABSTRACT

Fast compilation is essential for JIT-compilation use cases like dynamic languages or databases as well as development productivity when compiling static languages. Template-based compilation allows fast compilation times, but in existing approaches, templates are generally handwritten, limiting flexibility and causing substantial engineering effort.

In this paper, we introduce an approach based on MLIR that derives code templates for the instructions of any dialect automatically ahead-of-time. Template generation re-uses the existing compilation path present in the MLIR lowering of the instructions and thereby inherently supports code generation from different abstraction levels in a single step.
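To make the general idea of template-based code generation concrete, the following is a minimal sketch, not the paper's implementation. It assumes each instruction has a byte template prepared ahead of time, with fixed-width holes that are patched with operands at compile time. The `Template` class, the opcode names, and the byte values are all hypothetical placeholders rather than real machine code or MLIR operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """A precomputed code fragment with operand holes (hypothetical layout)."""
    code: bytes    # template bytes; hole positions hold zero placeholders
    holes: tuple   # (offset, operand_name) pairs; each hole is 4 bytes wide


# Assumption: in a real system these would be derived ahead of time.
# The byte values below are made-up opcodes, not real machine code.
TEMPLATES = {
    "load_const": Template(b"\x01\x00\x00\x00\x00", ((1, "value"),)),
    "add": Template(b"\x02", ()),
    "store": Template(b"\x03\x00\x00\x00\x00", ((1, "slot"),)),
}


def emit(op, **operands):
    """Copy the template for `op` and patch each hole with its operand."""
    template = TEMPLATES[op]
    buf = bytearray(template.code)
    for offset, name in template.holes:
        buf[offset:offset + 4] = operands[name].to_bytes(4, "little")
    return bytes(buf)


def compile_program(program):
    """Compile by concatenating patched templates: one lookup and copy per op."""
    out = bytearray()
    for op, operands in program:
        out += emit(op, **operands)
    return bytes(out)


# Hypothetical input program: compute 7 + 35 and store it in slot 0.
program = [
    ("load_const", {"value": 7}),
    ("load_const", {"value": 35}),
    ("add", {}),
    ("store", {"slot": 0}),
]
code = compile_program(program)
```

In this sketch the per-instruction work at compile time is a dictionary lookup, a copy, and a few patches, which is why such schemes can compile quickly. The cost is that each template is fixed in advance, so no optimization happens across instruction boundaries.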

Our results on compiling database queries and standard C programs show a compile-time improvement of 10–30x compared to LLVM -O0 with only moderate run-time slowdowns of 1–3x, resulting in an overall improvement of 2x in a JIT-compilation-based database setting.
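How a compile-time speedup and a run-time slowdown combine into an overall figure depends on how much of the total time is spent compiling. The sketch below is a simple model under stated assumptions, not the paper's evaluation method: total time is compile time plus run time, and the function name and all input values are hypothetical illustrations rather than measurements from the paper.

```python
def overall_speedup(compile_time, run_time, compile_speedup, run_slowdown):
    """Ratio of baseline total time to total time with the faster compiler.

    Assumes total time = compile time + run time, with no overlap between
    the two phases. All arguments are hypothetical inputs in arbitrary units.
    """
    baseline_total = compile_time + run_time
    fast_total = compile_time / compile_speedup + run_time * run_slowdown
    return baseline_total / fast_total


# Hypothetical compile-dominated workload: compilation takes 3x as long as
# execution, compilation becomes 10x faster, execution becomes 1.7x slower.
compile_dominated = overall_speedup(3.0, 1.0, 10.0, 1.7)

# Hypothetical run-dominated workload with the same factors: the slower
# generated code outweighs the compile-time savings.
run_dominated = overall_speedup(1.0, 10.0, 10.0, 1.7)
```

Under this model, a faster but less optimizing compiler pays off only when compilation is a large share of the total time, as it can be for short-running JIT-compiled queries.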

Published in

CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction
February 2024
261 pages
ISBN: 9798400705076
DOI: 10.1145/3640537

Copyright © 2024 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
