research-article

Object support in an array-based GPGPU extension for Ruby

Authors:

Matthias Springer,

Hidehiko Masuhara

ARRAY 2016: Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming

Pages 25 - 31

https://doi.org/10.1145/2935323.2935327

Published: 02 June 2016

Abstract

This paper presents implementation and optimization techniques to support objects in Ikra, an array-based parallel extension to Ruby with dynamic compilation. The high-level goal of Ikra is to allow developers to exploit GPU-based high-performance computing without paying much attention to intricate details of the underlying GPU infrastructure and CUDA. Ikra supports dynamically-typed object-oriented programming in Ruby and performs a number of optimizations. It supports parallel operations (e.g., map, each) on arrays of polymorphic objects, allowing polymorphic method calls inside a kernel by compiling them to conditional branches. To reduce branch divergence, Ikra shuffles thread assignments to base array elements based on runtime types of elements. To facilitate memory coalescing, Ikra stores objects in a structure-of-arrays (SoA) representation (columnar object layout). To eliminate intermediate data in global memory, Ikra merges cascaded parallel sections into one kernel using symbolic execution.
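The following plain-Ruby sketch illustrates three of the ideas named in the abstract: a structure-of-arrays (columnar) layout, thread assignment grouped by runtime type, and a polymorphic call expressed as a conditional branch. It runs on the CPU only and does not use Ikra's API. The classes `Circle` and `Square`, the helpers `to_soa`, `grouped_assignment`, `area_kernel`, and `parallel_area`, and the integer type tags are all hypothetical names chosen for illustration.

```ruby
# Plain-Ruby sketch (no GPU, no Ikra API). All names here are hypothetical.

class Circle
  attr_reader :r

  def initialize(r)
    @r = r
  end

  def area
    3 * r * r # integer approximation of pi keeps the example exact
  end
end

class Square
  attr_reader :side

  def initialize(side)
    @side = side
  end

  def area
    side * side
  end
end

# Columnar (structure-of-arrays) layout: one array per field plus a type tag
# column, instead of one heap object per element.
# Assumption: every element is either a Circle (tag 0) or a Square (tag 1).
def to_soa(objects)
  {
    type: objects.map { |o| o.is_a?(Circle) ? 0 : 1 },
    size: objects.map { |o| o.is_a?(Circle) ? o.r : o.side }
  }
end

# Element indices grouped by runtime type, so that neighbouring "threads"
# would take the same branch. sort_by is not guaranteed to be stable, so the
# original index serves as a tie-breaker.
def grouped_assignment(soa)
  (0...soa[:type].size).sort_by { |i| [soa[:type][i], i] }
end

# The polymorphic call to `area`, written as a conditional branch on the tag.
def area_kernel(soa, i)
  if soa[:type][i] == 0
    3 * soa[:size][i] * soa[:size][i]
  else
    soa[:size][i] * soa[:size][i]
  end
end

# Sequential stand-in for a parallel map: visit elements in grouped order and
# write each result back to the element's original position.
def parallel_area(objects)
  soa = to_soa(objects)
  result = Array.new(objects.size)
  grouped_assignment(soa).each { |i| result[i] = area_kernel(soa, i) }
  result
end
```

For `shapes = [Circle.new(1), Square.new(2), Circle.new(3), Square.new(4)]`, `parallel_area(shapes)` returns the same array as `shapes.map(&:area)`, while the elements are visited in the order `[0, 2, 1, 3]`, that is, all circles first and then all squares. On a GPU, such grouping is what would let adjacent threads follow the same branch.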

Cited By

M. Zhang, A. Alawneh, and T. Rogers. Characterizing Massively Parallel Polymorphism. In 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 205–216, 2021. Online publication date: March 2021. https://doi.org/10.1109/ISPASS51385.2021.00037

H. Gamaarachchi, M. Fawsan, F. Fasna, and D. Elkaduwe. User-friendly interface for GPGPU programming. In 2017 6th National Conference on Technology and Management (NCTM), pages 99–104, 2017. Online publication date: January 2017. https://doi.org/10.1109/NCTM.2017.7872835

Index Terms

Object support in an array-based GPGPU extension for Ruby
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
    2. General programming languages
      1. Language types
        1. Concurrent programming languages

Published In

ARRAY 2016: Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming

June 2016

68 pages

ISBN:9781450343848

DOI:10.1145/2935323

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

PLDI '16

Sponsor:

SIGSOFT

PLDI '16: ACM SIGPLAN Conference on Programming Language Design and Implementation

June 14, 2016

Santa Barbara, CA, USA

Acceptance Rates

Overall Acceptance Rate 17 of 25 submissions, 68%

Bibliometrics

Total Citations: 2
Total Downloads: 109
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 07 Mar 2025
