Abstract
The performance of machine learning systems relies heavily on code generators tailored to tensor computations. We propose an approach to the design and implementation of such code generators that leverages the natural structure of tensor algebra, and we illustrate the progressive lowering of domain-specific abstractions in the MLIR infrastructure.
Notes
1. This is analogous to the design of struct in LLVM IR: %1 = insertvalue {f64, f32, i32} %0, f32 42.0, 1 defines a new value %1 that holds the same elements as %0 except for the element at position 1, which now holds 42.0.
2. The operation also allows specifying sizes and strides, omitted here for simplicity.
3. Some transformations, such as software pipelining, remain naturally attached to loops.
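The SSA-style semantics described in Note 1, where insertvalue produces a fresh value rather than mutating its operand, can be sketched in Python using immutable tuples. This is only an analogy to the LLVM IR behavior, not part of the paper; the function name insertvalue is borrowed from the IR instruction for clarity.

```python
def insertvalue(aggregate, value, index):
    """Return a new aggregate identical to `aggregate`, except that the
    element at `index` now holds `value`. The input tuple is never
    mutated, mirroring the SSA semantics of LLVM IR's insertvalue."""
    return aggregate[:index] + (value,) + aggregate[index + 1:]

# %0 : {f64, f32, i32}, modeled here as a tuple
v0 = (3.14, 2.0, 7)
# %1 = insertvalue {f64, f32, i32} %0, f32 42.0, 1
v1 = insertvalue(v0, 42.0, 1)
# v0 is unchanged; v1 differs only at position 1
```

As in SSA form, both v0 and v1 remain usable afterwards: defining v1 does not invalidate or alter v0.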
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Vasilache, N. et al. (2023). Structured Operations: Modular Design of Code Generators for Tensor Compilers. In: Mendis, C., Rauchwerger, L. (eds) Languages and Compilers for Parallel Computing. LCPC 2022. Lecture Notes in Computer Science, vol 13829. Springer, Cham. https://doi.org/10.1007/978-3-031-31445-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31444-5
Online ISBN: 978-3-031-31445-2