
GPotion: An embedded DSL for GPU programming in Elixir

Authors:
Andre Rauber Du Bois (PPGC, Federal University of Pelotas, Brazil; ORCID 0000-0002-6790-5184)
Gerson Cavalheiro (PPGC, Federal University of Pelotas, Brazil; ORCID 0000-0002-4314-3429)

SBLP '23: Proceedings of the XXVII Brazilian Symposium on Programming Languages, September 2023, Pages 1–8
https://doi.org/10.1145/3624309.3624314

Published: 02 November 2023

ABSTRACT

This paper presents GPotion, a DSL for GPU programming embedded in the Elixir functional language. GPotion allows programmers to write low-level GPU kernels in Elixir, similar to CUDA kernels, while also providing high-level facilities such as garbage collection, type inference, and simplified data transfer. Preliminary experiments demonstrate that GPotion supports fast and efficient kernels with little overhead compared to pure CUDA. GPotion is implemented using Elixir's metaprogramming features, without modifying Elixir's compiler. The source code for GPotion and the benchmarks used in the experiments are available in a GitHub repository.


Index Terms

1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages

Published in

SBLP '23: Proceedings of the XXVII Brazilian Symposium on Programming Languages
September 2023
110 pages
ISBN:9798400716287
DOI:10.1145/3624309

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Elixir
actors model
gpu
parallel programming
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 22 of 50 submissions, 44%