DOI: 10.1145/3218603.3218626
research-article

Load-Triggered Warp Approximation on GPU

Published: 23 July 2018

ABSTRACT

Value similarity of operands across warps has been exploited to improve the energy efficiency of GPUs. Prior work, however, incurs significant overhead because it checks value similarity for every instruction, and it does not improve performance because it does not reduce the number of executed instructions. This work proposes Lock 'n Load (LnL), which triggers approximate execution of code regions by checking only the similarity of values returned from load instructions, and which fuses multiple approximated warps into a single warp.
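The load-triggered idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism from the paper: the abstract does not give the similarity metric, the threshold, or the fusion policy, so the relative-difference test, the `rel_tol` parameter, the greedy grouping, and the names `loads_similar` and `fuse_warps` are all hypothetical.

```python
from typing import List


def loads_similar(a: List[float], b: List[float], rel_tol: float = 0.05) -> bool:
    """Assumed similarity test: every lane's loaded value in warp `a` is
    within a relative difference of `rel_tol` of the same lane in warp `b`."""
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0.0 and abs(x - y) / scale > rel_tol:
            return False
    return True


def fuse_warps(load_values: List[List[float]], rel_tol: float = 0.05) -> List[List[int]]:
    """Greedily group warps by the values their load returned.

    Each warp joins the first group whose leader (the first warp in the
    group) loaded similar values; otherwise it starts a new group. In this
    model only the leader of each group would execute the code region, and
    the other members would reuse its (approximate) results.
    """
    groups: List[List[int]] = []
    for warp_id, values in enumerate(load_values):
        for group in groups:
            if loads_similar(load_values[group[0]], values, rel_tol):
                group.append(warp_id)
                break
        else:
            groups.append([warp_id])
    return groups


# Four warps of 32 lanes each; warps 0, 1, and 3 load nearly equal values.
warps = [[1.0] * 32, [1.01] * 32, [5.0] * 32, [1.02] * 32]
groups = fuse_warps(warps)
```

In this toy input the similarity check runs once per load rather than once per instruction, and only two of the four warps would go on to execute the region, which is the source of the instruction-count reduction the abstract describes.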

    • Published in

      ISLPED '18: Proceedings of the International Symposium on Low Power Electronics and Design
      July 2018
      327 pages
      ISBN: 9781450357043
      DOI: 10.1145/3218603

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 398 of 1,159 submissions, 34%
