ABSTRACT
Value similarity of operands across warps has been exploited to improve the energy efficiency of GPUs. Prior work, however, incurs significant overhead by checking value similarity for every instruction, and it does not improve performance because it does not reduce the number of executed instructions. This work proposes Lock 'n Load (LnL), which triggers approximate execution of code regions by checking the similarity of only the values returned from load instructions, and fuses multiple approximated warps into a single warp.
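The mechanism described in the abstract can be illustrated with a minimal Python sketch. The function names, the relative-difference similarity metric, and the tolerance value below are assumptions for illustration; they are not the paper's exact hardware mechanism, which checks similarity only on load results and fuses warps whose lanes agree.

```python
# Illustrative sketch of the LnL idea: similarity is checked only on
# values returned by load instructions, and warps whose loaded values
# are mutually similar are "fused" so a single representative warp
# executes on their behalf. The tolerance and metric are assumptions.

def values_similar(values, tol=0.05):
    """True if all loaded values lie within a relative tolerance of their mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        return all(v == 0 for v in values)
    return all(abs(v - mean) <= tol * abs(mean) for v in values)

def fuse_warps(warps, tol=0.05):
    """Group warps whose per-lane loaded values are pairwise similar.

    Each warp is a list of lane values returned by a load. Similar
    warps are fused behind one representative warp, reducing the
    number of warps that actually execute the approximated region.
    """
    fused = []  # list of (representative_warp, member_count)
    for warp in warps:
        for i, (rep, count) in enumerate(fused):
            # Lane-wise similarity check against the representative warp.
            if all(values_similar([a, b], tol) for a, b in zip(rep, warp)):
                fused[i] = (rep, count + 1)
                break
        else:
            fused.append((warp, 1))
    return fused
```

Under this sketch, two warps that load near-identical data collapse into one executed warp, which is the source of LnL's performance gain over schemes that only gate redundant lanes.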
Index Terms: Load-Triggered Warp Approximation on GPU