Ikra-Cpp: A C++/CUDA DSL for Object-Oriented Programming with Structure-of-Arrays Layout

Authors:

Matthias Springer,

Hidehiko Masuhara

WPMVP'18: Proceedings of the 2018 4th Workshop on Programming Models for SIMD/Vector Processing

Article No.: 6, Pages 1 - 9

https://doi.org/10.1145/3178433.3178439

Published: 24 February 2018

Abstract

Structure of Arrays (SOA) is a well-studied data layout technique for SIMD architectures. Previous work has shown that it can speed up applications in high-performance computing by several factors compared to a traditional Array of Structures (AOS) layout. However, most programmers are used to AOS-style programming, which is more readable and easier to maintain.
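To make the two layouts concrete, the following is a minimal hand-written C++ sketch of the same update loop over an AOS and an SOA representation. The names `ParticleAos`, `ParticlesSoa`, `step_aos` and `step_soa` are hypothetical and chosen for illustration only; they are not taken from the paper.

```cpp
#include <cstddef>
#include <vector>

// AOS: one struct per particle. The fields of a single object are adjacent
// in memory, so consecutive x values are sizeof(ParticleAos) bytes apart.
struct ParticleAos {
    float x;
    float vx;
    float mass;
};

// SOA: one array per field. The same field of consecutive objects is
// adjacent in memory, so consecutive x values are sizeof(float) bytes apart.
struct ParticlesSoa {
    std::vector<float> x;
    std::vector<float> vx;
    std::vector<float> mass;

    explicit ParticlesSoa(std::size_t n) : x(n), vx(n), mass(n) {}
    std::size_t size() const { return x.size(); }
};

// AOS update: readable object-style code, but strided access to x and vx.
void step_aos(std::vector<ParticleAos>& particles, float dt) {
    for (auto& p : particles) {
        p.x += p.vx * dt;
    }
}

// SOA update: unit-stride access to x and vx, which suits SIMD loads and
// coalesced GPU memory accesses.
void step_soa(ParticlesSoa& particles, float dt) {
    for (std::size_t i = 0; i < particles.size(); ++i) {
        particles.x[i] += particles.vx[i] * dt;
    }
}
```

Both functions compute the same result; only the memory layout differs. The AOS version is closer to how most object-oriented code is written, which is the readability trade-off the abstract refers to.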

We present Ikra-Cpp, an embedded DSL for object-oriented programming in C++/CUDA. Ikra-Cpp's notation is very close to standard AOS-style C++ code, but data is laid out as SOA. This gives programmers the performance benefit of SOA and the expressiveness of AOS-style object-oriented programming at the same time. Ikra-Cpp is well integrated with C++ and lets programmers use C++ notation and syntax for classes, fields, member functions, constructors and instance creation.
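The abstract does not say how Ikra-Cpp achieves this, so the following is only a generic illustration of the idea, not Ikra-Cpp's API or implementation. It assumes a hypothetical class `Body` whose instances are lightweight handles (an index) and whose field accessors read and write per-field arrays, so that calling code looks object-oriented while the storage is SOA.

```cpp
#include <array>
#include <cstddef>

// Hypothetical capacity for this sketch; no bounds checking is performed.
constexpr std::size_t kMaxBodies = 64;

// Hypothetical class, NOT part of Ikra-Cpp: an object is just an index into
// per-field arrays, but it is used like an ordinary C++ object.
class Body {
public:
    // Instance creation: claims the next free slot in every field array.
    static Body create(float pos, float vel) {
        Body b(counter()++);
        b.pos() = pos;
        b.vel() = vel;
        return b;
    }

    // Field accessors resolve to one element of the field's array.
    float& pos() { return pos_array()[id_]; }
    float& vel() { return vel_array()[id_]; }

    // Member function written in AOS style.
    void move(float dt) { pos() += vel() * dt; }

    static std::size_t count() { return counter(); }

private:
    explicit Body(std::size_t id) : id_(id) {}
    std::size_t id_;

    // SOA storage: one contiguous array per field, shared by all instances.
    static std::array<float, kMaxBodies>& pos_array() {
        static std::array<float, kMaxBodies> a{};
        return a;
    }
    static std::array<float, kMaxBodies>& vel_array() {
        static std::array<float, kMaxBodies> a{};
        return a;
    }
    static std::size_t& counter() {
        static std::size_t c = 0;
        return c;
    }
};
```

With this sketch, `Body b = Body::create(1.0f, 2.0f); b.move(0.5f);` reads like ordinary object-oriented code, while the `pos` values of successive bodies occupy adjacent array elements.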

Cited By

Huijben R, Aaldering J, Achten P, Scholz S. (2024). Flattening Combinations of Arrays and Records. Trends in Functional Programming, 220-240. Online publication date: 8-Jan-2024.
https://dl.acm.org/doi/10.1007/978-3-031-74558-4_10

Springer M, Sun Y, Masuhara H, Scholz S, Shivers O. (2018). Inner array inlining for structure of arrays layout. Proceedings of the 5th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, 50-58. Online publication date: 19-Jun-2018.
https://dl.acm.org/doi/10.1145/3219753.3219760

Index Terms

Ikra-Cpp: A C++/CUDA DSL for Object-Oriented Programming with Structure-of-Arrays Layout
1. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Data types and structures
      2. Language types
        Object oriented languages
        Parallel programming languages

Information & Contributors

Information

Published In

WPMVP'18: Proceedings of the 2018 4th Workshop on Programming Models for SIMD/Vector Processing

February 2018

68 pages

ISBN:9781450356466

DOI:10.1145/3178433

Editors:
Jan Eitzinger
University of Erlangen-Nuremberg, Germany
,
James Brodman
Intel, USA

Copyright © 2018 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

PPoPP '18

PPoPP '18: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 24 - 28, 2018

Vienna, Austria

Acceptance Rates

WPMVP'18 Paper Acceptance Rate 8 of 12 submissions, 67%;

Overall Acceptance Rate 20 of 30 submissions, 67%

Bibliometrics

Article Metrics

Total Citations: 2

Total Downloads: 155

Downloads (Last 12 months): 9

Downloads (Last 6 weeks): 0

Reflects downloads up to 07 Mar 2025
