Abstract
GPUs serve as accelerators for CPUs; we call applications that can exploit them GPU applications. Machine learning (ML) frameworks such as TensorFlow can run their jobs on either CPUs or GPUs. Nvidia claims that a 12GB K80 GPU delivers a 5-10x speedup on average. Although GPUs offer a performance advantage, they are expensive: a K80 GPU costs roughly $4,000, while a quad-core Intel Xeon E5 costs about $350.
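These figures alone suggest a tension between performance and cost. The following back-of-the-envelope calculation, using only the prices and speedups quoted above (and ignoring power, hosting, and memory differences), compares throughput per hardware dollar:

```python
# Throughput-per-dollar comparison using the figures quoted above.
# Hardware prices only; power, hosting, and memory are ignored.
gpu_price, cpu_price = 4000, 350   # USD: K80 GPU vs. quad-core Xeon E5
for speedup in (5, 10):            # Nvidia's claimed 5-10x range
    gpu_perf_per_dollar = speedup / gpu_price
    cpu_perf_per_dollar = 1 / cpu_price
    ratio = gpu_perf_per_dollar / cpu_perf_per_dollar
    print(f"{speedup}x speedup -> GPU gives {ratio:.2f}x "
          f"the throughput per dollar of a CPU")
```

Even at a 10x speedup, the GPU delivers less throughput per hardware dollar than the CPU (0.88x), which is why the CPU-or-GPU placement decision studied in this paper is non-trivial.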
The coexistence of traditional CPU applications and GPU applications urges cloud operators to build hybrid CPU/GPU clusters. While traditional applications execute only on CPUs, GPU applications can run on either CPUs or GPUs. This raises two questions: how should operators provision hybrid CPU/GPU clusters for CPU and GPU applications, and how should they allocate resources across CPUs and GPUs?
Interchangeable resources like CPUs and GPUs are not rare in large clusters. Network interfaces such as wireless, Ethernet, and InfiniBand, each with a different bandwidth, can also be interchangeable.
In this paper, we focus on CPU/GPU systems. We develop a tool that estimates the performance and resource demand of an ML job in an online manner (§2). We implement AlloX, a system that allocates resources and places applications on the right resource (CPU or GPU) to maximize the use of computational resources (§3). The proposed AlloX policy improves progress by up to 35% compared to default DRF [2]. We also build a model that minimizes the total cost of ownership for CPU/GPU data centers (§4).
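To convey the placement problem, here is a minimal, illustrative-only sketch: assuming each job's estimated CPU and GPU completion times are available (the role of the estimator in §2), a simple rule gives the scarce GPUs to the jobs that benefit from them most. The function name and greedy rule are hypothetical; AlloX's actual policy is more sophisticated than this.

```python
# Toy CPU/GPU placement: assign the few GPUs to the jobs with the
# largest GPU speedup, and run everything else on CPUs. This greedy
# rule only conveys the idea, not AlloX's actual algorithm.
def place(jobs, num_gpus):
    """jobs: {name: (cpu_time, gpu_time)}; returns {name: 'cpu' | 'gpu'}."""
    # Rank jobs by estimated speedup (cpu_time / gpu_time), largest first.
    by_speedup = sorted(jobs, key=lambda j: jobs[j][0] / jobs[j][1],
                        reverse=True)
    return {j: ("gpu" if i < num_gpus else "cpu")
            for i, j in enumerate(by_speedup)}

jobs = {"cnn": (100, 10), "linreg": (20, 15), "rnn": (80, 16)}
print(place(jobs, num_gpus=1))  # cnn gets the GPU: its 10x speedup is largest
```

With one GPU, the CNN job (10x speedup) is placed on the GPU, while the linear regression (1.3x) and RNN (5x) jobs run on CPUs.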
[1] O. Alipourfard, H. H. Liu, J. Chen, S. Venkataraman, M. Yu, and M. Zhang. CherryPick: Adaptively unearthing the best cloud configurations for big data analytics. In NSDI, 2017.
[2] A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant resource fairness: Fair allocation of multiple resource types. In NSDI, 2011.
[3] T. N. Le, Z. Liu, Y. Chen, and C. Bash. Joint capacity planning and operational management for sustainable data centers and demand response. In Proceedings of the Seventh International Conference on Future Energy Systems. ACM, 2016.
[4] S. Venkataraman, Z. Yang, M. J. Franklin, B. Recht, and I. Stoica. Ernest: Efficient performance prediction for large-scale advanced analytics. In NSDI, pages 363--378, 2016.
Index Terms
- AlloX: Allocation across Computing Resources for Hybrid CPU/GPU clusters