research-article

Improving CUDA™ C/C++ encoding readability to foster parallel application development

Authors:
Bruno F.L. Santos, Federal University of Sergipe Computer Department, Brazil
Hendrik T. Macedo, Federal University of Sergipe Computer Department, Brazil

ACM SIGSOFT Software Engineering Notes, Volume 37, Issue 1, January 2012, pp. 1–5. https://doi.org/10.1145/2088883.2088897

Published: 27 January 2012

Abstract

Graphics Processing Units (GPUs) have recently been used to enable parallel application development. The most prominent initiative has come from NVIDIA™ with its CUDA™ architecture, designed for GeForce™ graphics cards. However, even with CUDA's C-like programming language, parallel coding remains somewhat awkward compared to sequential coding. The programmer still has to deal with low-level hardware details such as the generation and synchronization of threads and GPU tracks and sectors. In this paper, we propose a programmer-friendly interface for CUDA-C programming that hides most hardware details from the programmer. We show how code readability is improved without undermining parallel execution performance.

Index Terms

Improving CUDA™ C/C++ encoding readability to foster parallel application development
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software creation and management
    1. Designing software
      1. Software implementation planning
        Software design techniques
    2. Software development process management
  2. Software notations and tools
    1. General programming languages
      1. Language types
        Concurrent programming languages

Published in

ACM SIGSOFT Software Engineering Notes Volume 37, Issue 1
January 2012
115 pages
ISSN: 0163-5948
DOI: 10.1145/2088883

Copyright © 2012 Authors
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GPU
NVIDIA CUDA™-C
application programming interface
parallel programming
programming language readability