Abstract
GPU architectures have become increasingly important in the multi-core era because of their formidable computational horsepower. With the assistance of effective programming paradigms such as CUDA, GPUs are widely adopted to accelerate scientific applications. Meanwhile, the surging energy consumption of GPUs has become a major challenge for both GPU architects and programmers. In addition to efforts to design energy-efficient GPU architectures, a comprehensive understanding of how programming affects the energy consumption of GPU applications is also indispensable from the programmer's perspective.
In this paper, we present a programming-oriented, PTX instruction-level energy model that lets programmers predict the energy consumption of their programs. Unlike previous models, which require hardware performance counters or architectural simulation, our model relies on the PTX instructions of a CUDA program, which makes it both portable and accurate. Using PTX instructions selected through an empirical study, we apply linear regression to build the GPU energy model. One appealing advantage of our model is that it does not require any instrumentation or profiling of the GPU application during execution. Moreover, the model can show programmers, step by step, how their programming choices affect the final energy consumption, especially while the code is still being written. We evaluate the model on an NVIDIA GeForce GTX 470 with the Rodinia benchmark suite. The results show that the model is accurate, with an average prediction error below 3.7%. With the help of our GPU energy model, programmers can gain valuable insights for improving the energy efficiency of their applications.
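The general idea of regressing energy on PTX instruction counts can be illustrated with a minimal Python sketch. It is not the authors' implementation: the opcode groups in `FEATURES`, the static (per-source-line) counting in `count_ptx`, and the helper names `fit_least_squares` and `predict` are all assumptions made for illustration, since the abstract does not state which instructions were selected or how they were counted.

```python
from collections import Counter

# Hypothetical opcode groups; the instruction set actually selected
# in the paper is not given in the abstract.
FEATURES = ["ld.global", "st.global", "add", "mul"]


def count_ptx(ptx_text):
    """Count static occurrences of each feature opcode in PTX source text."""
    counts = Counter()
    for line in ptx_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, directives (".reg", ".entry"), braces, labels.
        if not line or line.startswith(("//", ".", "{", "}")) or line.endswith(":"):
            continue
        # Drop an optional predicate guard such as "@%p1".
        if line.startswith("@"):
            parts = line.split(None, 1)
            if len(parts) < 2:
                continue
            line = parts[1]
        opcode = line.split()[0].rstrip(";")
        for feat in FEATURES:
            if opcode == feat or opcode.startswith(feat + "."):
                counts[feat] += 1
                break
    return [counts[f] for f in FEATURES]


def fit_least_squares(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    A = [[1.0] + [float(v) for v in row] for row in X]
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(A, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("singular design matrix; need more varied samples")
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def predict(w, counts):
    """Predicted energy: intercept plus weighted instruction counts."""
    return w[0] + sum(wi * c for wi, c in zip(w[1:], counts))
```

In such a setup, each training row would pair the instruction counts of one kernel with its measured energy, and the fitted weights would then give a prediction for a new kernel from its PTX alone, without running it. A real model would likely need executed rather than static instruction counts, which this sketch does not attempt.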
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhao, Q., Yang, H., Luan, Z., Qian, D. (2013). POIGEM: A Programming-Oriented Instruction Level GPU Energy Model for CUDA Program. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_10
DOI: https://doi.org/10.1007/978-3-319-03859-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science, Computer Science (R0)