DOI: 10.1145/3599957.3606208
research-article

Improve the Performance of Parallel Reduction on General-Purpose Graphics Processor Units Using Prediction Models

Published: 29 August 2023

ABSTRACT

When a kernel function executes on a general-purpose graphics processing unit (GPGPU), selecting an appropriate configuration setting is critical for performance. Configuration settings affect how GPGPU resources are allocated and utilized during the execution of a kernel function [1]. However, testing every possible configuration setting to find an optimal one is time-consuming and costly. To address this challenge, we propose a prediction mechanism that suggests a configuration setting with which the kernel function completes its operation in minimal execution time. We start by filtering on the amount of data, the mandatory parameters, and the optional parameters, and then calculate the occupancy of three critical GPGPU resources: warps, registers, and shared memory. We eliminate configuration settings whose average resource occupancy is lower than a user-defined value. The remaining configuration settings have better execution performance; we execute the kernel function with each of them and record the execution time. Finally, we use these configuration settings and their corresponding execution times as training data to build a prediction model with the logistic regression (LR) algorithm. At runtime, once the amount of data to be processed is known, the prediction model recommends a configuration setting with better performance. Our experiments confirm that the proposed mechanism improves kernel function execution performance more effectively than other mechanisms. Note that the proposed mechanism can also be applied to other kernel functions.
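
The occupancy-based filtering step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the per-SM limits in `SMLimits`, the three occupancy definitions, and the names `resident_blocks`, `occupancies`, and `filter_configs` are all assumptions made for illustration, and the sketch ignores the allocation granularity that real hardware applies to registers and shared memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    # Hypothetical per-multiprocessor limits; real values depend on the GPU.
    max_warps: int = 64
    max_blocks: int = 32
    registers: int = 65536
    shared_mem_bytes: int = 49152
    warp_size: int = 32


def resident_blocks(block_size, regs_per_thread, smem_per_block, lim):
    """Blocks that fit on one multiprocessor under every resource limit."""
    warps_per_block = -(-block_size // lim.warp_size)  # ceiling division
    threads = warps_per_block * lim.warp_size
    by_warps = lim.max_warps // warps_per_block
    by_regs = (lim.registers // (regs_per_thread * threads)
               if regs_per_thread else lim.max_blocks)
    by_smem = (lim.shared_mem_bytes // smem_per_block
               if smem_per_block else lim.max_blocks)
    return min(lim.max_blocks, by_warps, by_regs, by_smem)


def occupancies(block_size, regs_per_thread, smem_per_block, lim=SMLimits()):
    """Return (warp, register, shared-memory) occupancy, each in [0, 1]."""
    blocks = resident_blocks(block_size, regs_per_thread, smem_per_block, lim)
    warps_per_block = -(-block_size // lim.warp_size)
    threads = warps_per_block * lim.warp_size
    warp_occ = blocks * warps_per_block / lim.max_warps
    reg_occ = blocks * threads * regs_per_thread / lim.registers
    smem_occ = blocks * smem_per_block / lim.shared_mem_bytes
    return warp_occ, reg_occ, smem_occ


def filter_configs(configs, threshold, lim=SMLimits()):
    """Keep configurations whose average occupancy reaches the threshold."""
    kept = []
    for cfg in configs:
        if sum(occupancies(*cfg, lim)) / 3 >= threshold:
            kept.append(cfg)
    return kept


# Each configuration: (threads per block, registers per thread,
# shared memory per block in bytes). Values are made up for illustration.
candidates = [(256, 32, 4096), (64, 128, 16384)]
survivors = filter_configs(candidates, threshold=0.6)
```

Under these assumed limits, only the surviving configurations would then be timed on the device, and the resulting (configuration, execution time) pairs would serve as training data for the prediction model.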

References

  1. CUDA Toolkit Documentation v11.3.0. 2021. https://docs.nvidia.com/cuda/index.html
  2. Miroslav Kubat. 2017. An Introduction to Machine Learning. Springer, 43–62.
  3. Thanasekhar Balaiah and Ranjani Parthasarathi. 2020. Autotuning of configuration for program execution in GPUs. Concurrency and Computation: Practice and Experience 32, 9 (2020), e5635.
  4. Yalin Baştardar and Mustafa Özuysal. 2014. Introduction to machine learning. miRNomics: MicroRNA biology and computational analysis (2014), 105–128.
  5. Ben van Werkhoven. 2019. Kernel Tuner: A search-optimizing GPU code auto-tuner. Future Generation Computer Systems 90 (2019), 347–358.

    Published in

      RACS '23: Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems
      August 2023
      251 pages
      ISBN: 9798400702280
      DOI: 10.1145/3599957

      Copyright © 2023 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 393 of 1,581 submissions, 25%