DOI: 10.1145/3076113.3076119
research-article

Profiling a GPU database implementation: a holistic view of GPU resource utilization on TPC-H queries

Published: 14 May 2017

ABSTRACT

General Purpose computing on Graphics Processing Units (GPGPU) has become an increasingly popular option for accelerating database queries. However, GPUs are not well suited to all types of queries, as data transfer costs can often dominate query execution. We develop a methodology for quantifying how well databases utilize GPU architectures using proprietary profiling tools. By aggregating various profiling metrics, we break down the different aspects that comprise occupancy on the GPU across the runtime of query execution. We show that for the Alenka GPU database, only a small minority of execution time, roughly 5%, is spent on the GPU. We further show that even on queries with seemingly good performance, a large portion of the achieved occupancy can actually be attributed to stalls and scalar instructions.
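
The "share of execution time spent on the GPU" figure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' methodology or tooling: the interval record format, the function names `merge_intervals` and `gpu_time_fraction`, and all numbers below are hypothetical, chosen only to show how kernel activity intervals from a profiler timeline might be aggregated against a query's wall-clock time.

```python
from typing import Iterable, List, Tuple

# Hypothetical record format: (start_ms, end_ms) of one GPU kernel launch.
Interval = Tuple[float, float]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping intervals so concurrent kernels are not double counted."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def gpu_time_fraction(
    kernels: Iterable[Interval], query_start: float, query_end: float
) -> float:
    """Fraction of the query's wall-clock time with at least one kernel running."""
    total = query_end - query_start
    if total <= 0:
        raise ValueError("query_end must be greater than query_start")
    busy = 0.0
    for start, end in merge_intervals(kernels):
        # Clip each busy interval to the query window before summing.
        lo, hi = max(start, query_start), min(end, query_end)
        if hi > lo:
            busy += hi - lo
    return busy / total


# Illustrative numbers only (not taken from the paper).
example_kernels = [(10, 20), (15, 30), (50, 60)]
example_fraction = gpu_time_fraction(example_kernels, 0, 1000)
```

Merging intervals first is a deliberate choice in this sketch: summing raw kernel durations would overstate GPU time whenever kernels overlap on the timeline.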

      • Published in

        DAMON '17: Proceedings of the 13th International Workshop on Data Management on New Hardware
        May 2017
        70 pages
        ISBN:9781450350259
        DOI:10.1145/3076113

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 80 of 102 submissions, 78%
