Short paper. DOI: 10.1145/1984732.1984739

A refactoring tool to extract GPU kernels

Published: 22 May 2011

ABSTRACT

Significant performance gains can be achieved by using hardware architectures that integrate GPUs with conventional CPUs to form a hybrid and highly parallel computational engine. However, programming these novel architectures is tedious and error-prone, which hinders their adoption in a wider range of computationally intensive applications. In this paper we discuss a refactoring technique, called Extract Kernel, that transforms a loop written in C into a parallel function that uses NVIDIA's CUDA framework to execute on a GPU. We describe the selected approach and the challenges encountered, along with some early results that demonstrate the potential of this refactoring.
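
To make the idea of the transformation concrete, the following is a minimal sketch of the kind of loop-to-kernel rewrite the abstract describes. It is not the paper's tool output: the SAXPY-style loop, the function names (saxpy_loop, saxpy_kernel, saxpy_launch) and the block size are all assumptions chosen for illustration. To keep the sketch runnable without a GPU, it is written in plain C and emulates the thread grid with nested loops; the comments note the CUDA constructs that would appear in a real kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input to the refactoring: a sequential C loop in which
 * each iteration reads and writes only element i. */
void saxpy_loop(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* The loop body rewritten as a per-thread function. In CUDA this would
 * be declared __global__, and block_idx, block_dim and thread_idx would
 * come from blockIdx.x, blockDim.x and threadIdx.x. */
void saxpy_kernel(size_t block_idx, size_t block_dim, size_t thread_idx,
                  size_t n, float a, const float *x, float *y) {
    size_t i = block_idx * block_dim + thread_idx; /* global index */
    if (i < n) { /* guard: the last block may be only partly used */
        y[i] = a * x[i] + y[i];
    }
}

/* CPU emulation of a kernel launch. On a GPU the two loops below would
 * be replaced by a launch such as kernel<<<blocks, threads>>>(...),
 * preceded and followed by host/device memory copies. */
void saxpy_launch(size_t n, float a, const float *x, float *y,
                  size_t threads_per_block) {
    assert(threads_per_block > 0);
    /* Round up so that every element is covered by some thread. */
    size_t blocks = (n + threads_per_block - 1) / threads_per_block;
    for (size_t b = 0; b < blocks; b++) {
        for (size_t t = 0; t < threads_per_block; t++) {
            saxpy_kernel(b, threads_per_block, t, n, a, x, y);
        }
    }
}
```

Calling saxpy_launch with the same arguments as saxpy_loop should produce the same contents in y, because each iteration is independent of the others. A loop with dependences between iterations could not be rewritten this simply.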

Published in

WRT '11: Proceedings of the 4th Workshop on Refactoring Tools
May 2011, 52 pages
ISBN: 9781450305792
DOI: 10.1145/1984732
Program Chairs: Danny Dig, Don Batory

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 9 of 9 submissions, 100%
