GPotion: An embedded DSL for GPU programming in Elixir

Published: 02 November 2023

ABSTRACT

This paper presents GPotion, a DSL for GPU programming embedded in the Elixir functional language. GPotion allows programmers to write low-level GPU kernels, similar to CUDA kernels, in Elixir, while also providing high-level facilities such as garbage collection, type inference, and simplified data transfer. Preliminary experiments demonstrate that GPotion kernels are fast and efficient, with little overhead in comparison to pure CUDA. GPotion is implemented using the metaprogramming features of Elixir, without modifying Elixir’s compiler. The source code for GPotion and the benchmarks used in the experiments are available in a GitHub repository.

Published in

SBLP '23: Proceedings of the XXVII Brazilian Symposium on Programming Languages
September 2023
110 pages
ISBN: 9798400716287
DOI: 10.1145/3624309

Copyright © 2023 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 22 of 50 submissions, 44%