- Sponsor: SIGPLAN
Proceeding Downloads
Meta-programming and auto-tuning in the search for high performance GPU code
Writing high performance GPGPU code is often difficult and time-consuming, potentially requiring laborious manual tuning of low-level details. Despite these challenges, the cost of ignoring GPUs in high performance computing is increasingly large. Auto-...
Converting data-parallelism to task-parallelism by rewrites: purely functional programs across multiple GPUs
High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. ...
Functional array streams
Regular array languages for high performance computing based on aggregate operations provide a convenient parallel programming model, which enables the generation of efficient code for SIMD architectures, such as GPUs. However, the data sets that can ...
Skeletons for distributed topological computation
Parallel implementation of topological algorithms is highly desirable, but the challenges, from reconstructing algorithms around independent threads through to runtime load balancing, have proven to be formidable. This problem, made all the more acute ...
Generate and offshore: type-safe and modular code generation for low-level optimization
We present the Asuna system which supports implicitly heterogeneous multi-stage programming based on MetaOCaml, a multi-stage extension of OCaml. Our system allows programmers to write code generators in a high-level language, and generated code can be ...
Scalan: a framework for domain-specific hotspot optimization (invited tutorial)
While high-level abstractions greatly simplify program development, they ultimately need to be eliminated to produce high-performance code. This can be done using generative programming; one particularly usable approach is Lightweight Modular Staging. ...
Proceedings of the 4th ACM SIGPLAN Workshop on Functional High-Performance Computing