DOI: 10.1145/1996130.1996160

Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

Published: 08 June 2011

Abstract

Driven by the emergence of GPUs as a major player in high-performance computing and by the rapidly growing popularity of cloud environments, cloud providers now offer GPU instances. The use of GPUs in the cloud, however, is still in its initial stages, and the challenge of making the GPU a truly shared resource in the cloud has not yet been addressed.
This paper presents a framework that enables applications executing within virtual machines to transparently share one or more GPUs. Our contributions are twofold: we extend open-source GPU virtualization software to include efficient GPU sharing, and we propose solutions to the conceptual problem of GPU kernel consolidation. In particular, we introduce a method for computing the affinity score between two or more kernels, which indicates the potential performance improvement from consolidating them. In addition, we explore molding as a means to achieve efficient GPU sharing even for kernels with high or conflicting resource requirements. We use these concepts to develop an algorithm that efficiently maps a set of kernels onto a pair of GPUs. We extensively evaluate our framework using eight popular GPU kernels and two Fermi GPUs. We find that even when contention is high our consolidation algorithm is effective in improving throughput, and that the runtime overhead of our framework is low.
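To make the consolidation idea concrete, the Python sketch below shows one possible shape of an affinity-driven scheduler for a pair of GPUs. It is not the framework described in the paper: the KernelProfile fields, the affinity formula, the mold() heuristic, and the 0.6 pairing threshold are assumptions introduced here purely for illustration; the paper defines its own affinity score, molding strategy, and mapping algorithm.

# Illustrative sketch only: the profile fields, the affinity formula, the
# mold() heuristic, and the pairing threshold are assumptions made for this
# example, not the paper's actual definitions.
from dataclasses import dataclass, replace
from itertools import combinations


@dataclass(frozen=True)
class KernelProfile:
    name: str
    compute_util: float  # assumed fraction of SM compute capacity used (0..1)
    memory_util: float   # assumed fraction of memory bandwidth used (0..1)
    blocks: int          # number of thread blocks in the launch grid


def affinity(a: KernelProfile, b: KernelProfile) -> float:
    """Toy affinity score: kernels stressing different resources pair well.

    Returns a value in (0, 1]; higher suggests consolidation is more promising.
    """
    compute_pressure = a.compute_util + b.compute_util
    memory_pressure = a.memory_util + b.memory_util
    # Penalize pairs whose combined demand overflows either resource.
    return 1.0 / max(compute_pressure, memory_pressure, 1.0)


def mold(k: KernelProfile, factor: float = 0.5) -> KernelProfile:
    """Toy molding step: shrink the launch so the kernel leaves room for a partner.

    Optimistically assumes resource usage scales linearly with grid size.
    """
    return replace(
        k,
        blocks=max(1, int(k.blocks * factor)),
        compute_util=k.compute_util * factor,
        memory_util=k.memory_util * factor,
    )


def consolidate(kernels, threshold: float = 0.6):
    """Greedily pair kernels by affinity, mold poor pairs, round-robin onto two GPUs."""
    pending = list(kernels)
    units = []  # each unit is a tuple of kernels co-scheduled on one GPU
    while len(pending) >= 2:
        # Pick the remaining pair with the highest affinity score.
        a, b = max(combinations(pending, 2), key=lambda pair: affinity(*pair))
        pending.remove(a)
        pending.remove(b)
        if affinity(a, b) < threshold:
            # Conflicting resource demands: mold the heavier kernel first.
            if a.compute_util + a.memory_util >= b.compute_util + b.memory_util:
                a = mold(a)
            else:
                b = mold(b)
        units.append((a, b))
    units.extend((k,) for k in pending)  # a leftover kernel runs alone
    # Distribute the consolidated units across the two GPUs.
    schedule = {0: [], 1: []}
    for i, unit in enumerate(units):
        schedule[i % 2].append(tuple(k.name for k in unit))
    return schedule


if __name__ == "__main__":
    # Hypothetical profiles; the paper's eight benchmark kernels are not listed here.
    workload = [
        KernelProfile("compute_bound_A", compute_util=0.9, memory_util=0.3, blocks=4096),
        KernelProfile("memory_bound_B", compute_util=0.3, memory_util=0.8, blocks=2048),
        KernelProfile("balanced_C", compute_util=0.6, memory_util=0.6, blocks=1024),
    ]
    print(consolidate(workload))

The design point the sketch mirrors is the one the abstract argues for: pair kernels whose resource demands are complementary, and mold (shrink) a launch when a pair's combined demand would oversubscribe a GPU.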


Published In

HPDC '11: Proceedings of the 20th international symposium on High performance distributed computing
June 2011
296 pages
ISBN: 9781450305525
DOI: 10.1145/1996130


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. cloud computing
  2. cuda
  3. gpu
  4. virtualization

Qualifiers

  • Research-article

Conference

HPDC '11

Acceptance Rates

Overall acceptance rate: 166 of 966 submissions (17%)


Cited By

  • EnergAt: Fine-Grained Energy Attribution for Multi-Tenancy. ACM SIGEnergy Energy Informatics Review 4(3), 18-25. DOI: 10.1145/3698365.3698369. Online publication date: 1-Jul-2024.
  • SMSS: Stateful Model Serving in Metaverse With Serverless Computing and GPU Sharing. IEEE Journal on Selected Areas in Communications 42(3), 799-811. DOI: 10.1109/JSAC.2023.3345401. Online publication date: Mar-2024.
  • Understanding the Topics and Challenges of GPU Programming by Classifying and Analyzing Stack Overflow Posts. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1444-1456. DOI: 10.1145/3611643.3616365. Online publication date: 30-Nov-2023.
  • Disaggregated GPU Acceleration for Serverless Applications. ACM SIGOPS Operating Systems Review 57(1), 10-20. DOI: 10.1145/3606557.3606560. Online publication date: 28-Jun-2023.
  • EnergAt: Fine-Grained Energy Attribution for Multi-Tenancy. Proceedings of the 2nd Workshop on Sustainable Computer Systems, 1-8. DOI: 10.1145/3604930.3605716. Online publication date: 9-Jul-2023.
  • Gemini: Enabling Multi-Tenant GPU Sharing Based on Kernel Burst Estimation. IEEE Transactions on Cloud Computing 11(1), 854-867. DOI: 10.1109/TCC.2021.3119205. Online publication date: 1-Jan-2023.
  • GPUPool. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 317-332. DOI: 10.1145/3559009.3569650. Online publication date: 8-Oct-2022.
  • TCUDA: A QoS-based GPU Sharing Framework for Autonomous Navigation Systems. 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 1-10. DOI: 10.1109/SBAC-PAD55451.2022.00011. Online publication date: Nov-2022.
  • DGSF: Disaggregated GPUs for Serverless Functions. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 739-750. DOI: 10.1109/IPDPS53621.2022.00077. Online publication date: May-2022.
  • A Provenance-based Execution Strategy for Variant GPU-accelerated Scientific Workflows in Clouds. Journal of Grid Computing 20(4). DOI: 10.1007/s10723-022-09625-y. Online publication date: 1-Dec-2022.
