poster

A microbenchmark to study GPU performance models

Author:

Vasily Volkov

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Pages 421 - 422

https://doi.org/10.1145/3178487.3178536

Published: 10 February 2018

Abstract

Basic microarchitectural features of NVIDIA GPUs have been stable for a decade, and many analytic solutions were proposed to model their performance. We present a way to review, systematize, and evaluate these approaches by using a microbenchmark. In this manner, we produce a brief algebraic summary of key elements of selected performance models, identify patterns in their design, and highlight their previously unknown limitations. Also, we identify a potentially superior method for estimating performance based on classical work.




Published In

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 2018

442 pages

ISBN:9781450349826

DOI:10.1145/3178487

General Chair: Andreas Krall, Vienna University of Technology, Austria

Program Chair: Thomas R. Gross, ETH Zürich, Switzerland

ACM SIGPLAN Notices Volume 53, Issue 1
PPoPP '18
January 2018
426 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3200691
Editor:
Matthew Fluet
Rochester Institute of Technology

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Poster

Conference

PPoPP '18

Sponsor:

PPoPP '18: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 24 - 28, 2018

Vienna, Austria

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 5
Total Downloads: 795
Downloads (Last 12 months): 45
Downloads (Last 6 weeks): 4

Reflects downloads up to 03 Mar 2025

Cited By

Lemeire, J., Cornelis, J., and Konstantinidis, E. 2023. Analysis of the analytical performance models for GPUs and extracting the underlying Pipeline model. Journal of Parallel and Distributed Computing 173, 32--47. Online publication date: Mar-2023.
https://doi.org/10.1016/j.jpdc.2022.11.002

Yin, M., Xu, X., Zhang, T., and Ye, C. 2021. Performance Evaluation Model for Matrix Calculation on GPU. International Journal of Pattern Recognition and Artificial Intelligence 35:15. Online publication date: 15-Oct-2021.
https://doi.org/10.1142/S0218001421540306

Belhadi, A., Djenouri, Y., and Lin, J. 2019. Comparative Study on Trajectory Outlier Detection Algorithms. In 2019 International Conference on Data Mining Workshops (ICDMW), 415--423. Online publication date: Nov-2019.
https://doi.org/10.1109/ICDMW.2019.00067

Arafa, Y., Badawy, A., Chennupati, G., Santhi, N., and Eidenbenz, S. 2019. Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs. In 2019 IEEE High Performance Extreme Computing Conference (HPEC), 1--8. Online publication date: Sep-2019.
https://doi.org/10.1109/HPEC.2019.8916466

Wang, Y., Chai, Y., and Zhang, Q. 2022. WALOR: Workload-Driven Adaptive Layout Optimization of Raft Groups for Heterogeneous Distributed Key-Value Stores. In Network and Parallel Computing, 290--301. Online publication date: 24-Sep-2022.
https://dl.acm.org/doi/10.1007/978-3-031-21395-3_27
