DOI:10.1145/3178487.3178536
poster

A microbenchmark to study GPU performance models

Published: 10 February 2018

Abstract

The basic microarchitectural features of NVIDIA GPUs have been stable for a decade, and many analytical models of their performance have been proposed. We present a way to review, systematize, and evaluate these approaches using a microbenchmark. With it, we produce a brief algebraic summary of the key elements of selected performance models, identify patterns in their design, and highlight previously unknown limitations. We also identify a potentially superior method for estimating performance, based on classical work.




Published In

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2018
442 pages
ISBN:9781450349826
DOI:10.1145/3178487
  • ACM SIGPLAN Notices, Volume 53, Issue 1
    PPoPP '18
    January 2018
    426 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/3200691
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Poster

Conference

PPoPP '18

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%


Cited By

  • (2023) Analysis of the analytical performance models for GPUs and extracting the underlying Pipeline model. Journal of Parallel and Distributed Computing 173 (32-47). DOI:10.1016/j.jpdc.2022.11.002. Online publication date: Mar-2023
  • (2021) Performance Evaluation Model for Matrix Calculation on GPU. International Journal of Pattern Recognition and Artificial Intelligence 35:15. DOI:10.1142/S0218001421540306. Online publication date: 15-Oct-2021
  • (2019) Comparative Study on Trajectory Outlier Detection Algorithms. 2019 International Conference on Data Mining Workshops (ICDMW) (415-423). DOI:10.1109/ICDMW.2019.00067. Online publication date: Nov-2019
  • (2019) Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs. 2019 IEEE High Performance Extreme Computing Conference (HPEC) (1-8). DOI:10.1109/HPEC.2019.8916466. Online publication date: Sep-2019
  • (2022) WALOR: Workload-Driven Adaptive Layout Optimization of Raft Groups for Heterogeneous Distributed Key-Value Stores. Network and Parallel Computing (290-301). DOI:10.1007/978-3-031-21395-3_27. Online publication date: 24-Sep-2022
