ABSTRACT
Significant performance gains can be achieved by using hardware architectures that integrate GPUs with conventional CPUs to form a hybrid, highly parallel computational engine. However, programming these novel architectures is tedious and error-prone, hindering their adoption in an even wider range of computationally intensive applications. In this paper we discuss a refactoring, called Extract Kernel, that transforms a loop written in C into a parallel function that uses NVIDIA's CUDA framework to execute on a GPU. We describe the selected approach and the challenges encountered, as well as early results that demonstrate the potential of this refactoring.
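To make the transformation concrete, the following is a hand-written sketch of the kind of output an Extract Kernel refactoring could produce for a simple C loop. The loop, kernel name, and launch parameters are illustrative assumptions, not taken from the paper or its tool:

```cuda
// Original sequential C loop (hypothetical example):
//   for (int i = 0; i < n; i++)
//       y[i] = a * x[i] + y[i];

// The loop body becomes a CUDA kernel; each thread handles one iteration.
__global__ void saxpy_kernel(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                       // guard: thread count may exceed n
        y[i] = a * x[i] + y[i];
}

// The extracted function allocates device memory, copies data,
// launches the kernel, and copies the result back to the host.
void saxpy(int n, float a, const float *x, float *y) {
    float *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, x, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;   // round up to cover all i
    saxpy_kernel<<<blocks, threads>>>(n, a, d_x, d_y);

    cudaMemcpy(y, d_y, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
}
```

Note that beyond the syntactic rewrite, the refactoring must insert the host/device memory transfers and a launch configuration, and the loop must first be shown free of cross-iteration dependences for the transformation to be safe.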