Low-overhead dynamic sharing of graphics memory space in GPU virtualization environments

Gu, Minwoo; Park, Younghun; Kim, Youngjae; Park, Sungyong

doi:10.1007/s10586-019-02967-5

Published: 31 July 2019

Cluster Computing, Volume 23, pages 2167–2178 (2020)

Minwoo Gu¹, Younghun Park¹, Youngjae Kim¹ & Sungyong Park¹ (ORCID: orcid.org/0000-0002-0309-1820)

Abstract

The proliferation of GPU-intensive workloads has created a new challenge: providing low-overhead, efficient GPU virtualization in GPU clouds. gVirt is a full GPU virtualization solution for Intel’s integrated GPUs, which use the system’s on-board memory as graphics memory. To overcome gVirt’s inherent limit on the number of simultaneous virtual machines (VMs), gScale proposed a dynamic sharing scheme for global graphics memory among VMs that copies the entries of a private graphics translation table (GTT) to the physical GTT on each GPU context switch. However, copying entries between the private GTT and the physical GTT often causes significant overhead, which grows when the global graphics memory spaces used by different VMs overlap. This paper identifies the copy overhead incurred at GPU context switches as one of the major performance bottlenecks and proposes a low-overhead dynamic memory management scheme called DymGPU. DymGPU provides two memory allocation algorithms: size-based and utilization-based. The size-based algorithm allocates memory space according to the memory size required by each VM, whereas the utilization-based algorithm allocates memory space by considering the GPU utilization of each VM. DymGPU is also dynamic in the sense that the global graphics memory space used by each VM is rearranged at runtime by periodically checking for idle VMs and the GPU utilization of each runnable VM. We implemented the proposed approach in gVirt and confirmed that it reduces GPU context switch time by up to 53% and improves the overall performance of various GPU applications by up to 39%.
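The abstract does not spell out the allocation algorithms, so the following is only a toy model of the overlap problem it describes, not DymGPU's actual method. It assumes each VM occupies one contiguous range of slots in the shared global graphics memory space, and that the number of overlapping slots is a rough proxy for the GTT entries that must be copied at a context switch. The names `Region`, `total_overlap`, `place_all_at_zero` and `place_size_based`, and the greedy largest-first packing itself, are hypothetical and chosen for illustration. The utilization-based variant is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Contiguous range of slots one VM uses in the shared space (assumption)."""
    start: int
    size: int

    @property
    def end(self) -> int:
        return self.start + self.size


def overlap(a: Region, b: Region) -> int:
    """Number of slots shared by two regions (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def total_overlap(regions: dict[str, Region]) -> int:
    """Sum of pairwise overlaps; a rough proxy for copy work at context switches."""
    names = sorted(regions)
    return sum(
        overlap(regions[x], regions[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    )


def place_all_at_zero(sizes: dict[str, int]) -> dict[str, Region]:
    """Baseline: every VM starts at slot 0, so all regions overlap."""
    return {vm: Region(0, size) for vm, size in sizes.items()}


def place_size_based(sizes: dict[str, int], capacity: int) -> dict[str, Region]:
    """Hypothetical size-based placement: largest VM first, packed back to back,
    wrapping to slot 0 once the shared space is exhausted."""
    regions: dict[str, Region] = {}
    cursor = 0
    for vm, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        if size > capacity:
            raise ValueError(f"{vm} needs {size} slots, capacity is {capacity}")
        if cursor + size > capacity:
            cursor = 0  # out of free space: this VM must overlap earlier ones
        regions[vm] = Region(cursor, size)
        cursor += size
    return regions


if __name__ == "__main__":
    sizes = {"vm1": 4, "vm2": 3, "vm3": 2}  # made-up slot requirements
    print(total_overlap(place_all_at_zero(sizes)))
    print(total_overlap(place_size_based(sizes, capacity=8)))
```

With the made-up sizes above and a capacity of 8 slots, the baseline yields a total overlap of 7 slots, while the size-aware packing yields 2. This is consistent with the abstract's observation that overlap between VMs is what makes the copy overhead worse.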

References

Park, Y., Gu, M., Yoo, S., Kim, Y., Park, S.: DymGPU: Dynamic Memory Management for Sharing GPUs in Virtualized Clouds. In: 2018 IEEE 3rd International Workshops on Foundations and Applications of Self* Systems (FAS*W), pp. 51–57. IEEE (2018)
The compute architecture of Intel® processor graphics Gen9. https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf
Pascal GPU architecture | NVIDIA. https://www.nvidia.com/en-us/data-center/pascal-gpu-architecture/
Duato, J., Pena, A.J., Silla, F., Mayo, R., Quintana-Ortí, E.S.: rCUDA: reducing the number of GPU-based accelerators in high performance clusters. In: 2010 International Conference on High Performance Computing and Simulation (HPCS), pp. 224–231. IEEE (2010)
Giunta, G., Montella, R., Agrillo, G., Coviello, G.: A GPGPU transparent virtualization component for high performance computing clouds. In: European Conference on Parallel Processing, pp. 379–391. Springer (2010)
Xiao, S., Balaji, P., Zhu, Q., Thakur, R., Coghlan, S., Lin, H., Wen, G., Hong, J., Feng, W.C.: VOCL: an optimized environment for transparent virtualization of graphics processing units. In: Innovative Parallel Computing (InPar), 2012, pp. 1–12. IEEE (2012)
Abramson, D., Jackson, J., Muthrasanallur, S., Neiger, G., Regnier, G., Sankaran, R., Schoinas, I., Uhlig, R., Vembu, B., Wiegert, J.: Intel virtualization technology for directed I/O. Intel Technol. J 10(3), 179–192 (2006)
Tian, K., Dong, Y., Cowperthwaite, D.: A full GPU virtualization solution with mediated pass-through. In: USENIX Annual Technical Conference, pp. 121–132 (2014)
Suzuki, Y., Kato, S., Yamada, H., Kono, K.: GPUvm: why not virtualizing GPUs at the hypervisor? In: USENIX Annual Technical Conference, pp. 109–120 (2014)
Xue, M., Tian, K., Dong, Y., Ma, J., Wang, J., Qi, Z., He, B., Guan, H.: gScale: scaling up GPU virtualization with dynamic sharing of graphics memory space. In: USENIX Annual Technical Conference, pp. 579–590 (2016)
Kehne, J., Metter, J., Bellosa, F.: GPUswap: enabling oversubscription of GPU memory through transparent swapping. In: ACM SIGPLAN Notices, vol. 50, pp. 65–77. ACM (2015)
Kehne, J., Hillenbrand, M., Metter, J., Gottschlag, M., Merkel, M., Bellosa, F.: GPrioSwap: towards a swapping policy for GPUs. In: Proceedings of the 10th ACM International Systems and Storage Conference, p. 10. ACM (2017)
Xue, M., Ma, J., Li, W., Tian, K., Dong, Y., Wu, J., Qi, Z., He, B., Guan, H.: Scalable GPU virtualization with dynamic sharing of graphics memory space. IEEE Trans. Parallel Distrib. Syst. 1, 1–1 (2018)
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: ACM SIGOPS Operating Systems Review, vol. 37, pp. 164–177. ACM (2003)
Intel® GVT-g setup guide. https://github.com/intel/Igvtg-kernel/blob/2016q4-4.3.0/iGVT-g_Setup_Guide.txt
Valley benchmark | UNIGINE benchmarks. https://benchmark.unigine.com/valley
Superposition benchmark | UNIGINE benchmarks. https://benchmark.unigine.com/superposition
Phoronix Test Suite - Linux testing & benchmarking platform, automated testing, open-source benchmarking. http://phoronix-test-suite.com/
cairographics.org. https://www.cairographics.org/
Che, S., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J.W., Lee, S.H., Skadron, K.: Rodinia: a benchmark suite for heterogeneous computing. In: IEEE International Symposium on Workload Characterization, 2009. IISWC 2009. pp. 44–54. IEEE (2009)
Kato, S., McThrow, M., Maltzahn, C., Brandt, S.A.: Gdev: first-class GPU resource management in the operating system. In: USENIX Annual Technical Conference, pp. 401–412. Boston (2012)
Wang, K., Ding, X., Lee, R., Kato, S., Zhang, X.: GDM: device memory management for gpgpu computing. ACM SIGMETRICS Perform. Eval. Rev. 42(1), 533–545 (2014)
Ji, F., Lin, H., Ma, X.: RSVM: a region-based software virtual memory for GPU. In: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 269–278. IEEE Press (2013)
Becchi, M., Sajjapongse, K., Graves, I., Procter, A., Ravi, V., Chakradhar, S.: A virtual memory based runtime to support multi-tenancy in clusters with GPUs. In: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, pp. 97–108. ACM (2012)
Official GitHub repository of NVIDIA Docker. https://github.com/NVIDIA/nvidia-docker
Kang, D., Jun, T.J., Kim, D., Kim, J., Kim, D.: ConVGPU: GPU management middleware in container based virtualized environment. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER), pp. 301–309. IEEE (2017)
Gu, J., Song, S., Li, Y., Luo, H.: GaiaGPU: sharing GPUs in container clouds. In: IEEE International Conference on Parallel & Distributed Processing with Applications (IEEE ISPA 2018), pp. 469–476. Melbourne, Australia (2018)

Acknowledgements

This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2017M3C4A7080245).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Sogang University, 35, Baekbeom-ro, Mapo-gu, Seoul, Republic of Korea
Minwoo Gu, Younghun Park, Youngjae Kim & Sungyong Park

Corresponding author

Correspondence to Sungyong Park.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gu, M., Park, Y., Kim, Y. et al. Low-overhead dynamic sharing of graphics memory space in GPU virtualization environments. Cluster Comput 23, 2167–2178 (2020). https://doi.org/10.1007/s10586-019-02967-5

Received: 01 January 2019
Revised: 05 June 2019
Accepted: 23 July 2019
Published: 31 July 2019
Issue Date: September 2020
DOI: https://doi.org/10.1007/s10586-019-02967-5
