Abstract
GPUs serve as accelerators for CPUs; we call applications that can exploit them GPU applications. Machine learning (ML) frameworks such as TensorFlow can run their jobs on either CPUs or GPUs. Nvidia claims that a 12GB K80 GPU delivers a 5-10x speedup on average. Although GPUs offer a performance advantage, they are expensive: a K80 GPU costs roughly $4,000, while a quad-core Intel Xeon E5 costs about $350.
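These figures alone suggest a tension between performance and cost. The following back-of-the-envelope calculation, using only the prices and speedups quoted above (and ignoring power, hosting, and memory differences), compares throughput per hardware dollar:

```python
# Throughput-per-dollar comparison using the figures quoted above.
# Hardware prices only; power, hosting, and memory are ignored.
gpu_price, cpu_price = 4000, 350   # USD: K80 GPU vs. quad-core Xeon E5
for speedup in (5, 10):            # Nvidia's claimed 5-10x range
    gpu_perf_per_dollar = speedup / gpu_price
    cpu_perf_per_dollar = 1 / cpu_price
    ratio = gpu_perf_per_dollar / cpu_perf_per_dollar
    print(f"{speedup}x speedup -> GPU gives {ratio:.2f}x "
          f"the throughput per dollar of a CPU")
```

Even at a 10x speedup, the GPU delivers less throughput per hardware dollar than the CPU (0.88x), which is why the CPU-or-GPU placement decision studied in this paper is non-trivial.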
The coexistence of traditional CPU applications and GPU applications urges cloud operators to build hybrid CPU/GPU clusters. While traditional applications execute only on CPUs, GPU applications can run on either CPUs or GPUs. This raises two questions: how should operators provision hybrid CPU/GPU clusters for CPU and GPU applications, and how should they allocate resources across CPUs and GPUs?
Interchangeable resources like CPUs and GPUs are not rare in large clusters. Network interfaces such as wireless, Ethernet, and InfiniBand, each with a different bandwidth, can also be interchangeable.
In this paper, we focus on CPU/GPU systems. We develop a tool that estimates the performance and resource demand of an ML job in an online manner (§2). We implement AlloX, a system that allocates resources and places applications on the right resource (CPU or GPU) to maximize the use of computational resources (§3). The proposed AlloX policy improves progress by up to 35% compared to default DRF [2]. We also build a model that minimizes the total cost of ownership for CPU/GPU data centers (§4).
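To convey the placement problem, here is a minimal, illustrative-only sketch: assuming each job's estimated CPU and GPU completion times are available (the role of the estimator in §2), a simple rule gives the scarce GPUs to the jobs that benefit from them most. The function name and greedy rule are hypothetical; AlloX's actual policy is more sophisticated than this.

```python
# Toy CPU/GPU placement: assign the few GPUs to the jobs with the
# largest GPU speedup, and run everything else on CPUs. This greedy
# rule only conveys the idea, not AlloX's actual algorithm.
def place(jobs, num_gpus):
    """jobs: {name: (cpu_time, gpu_time)}; returns {name: 'cpu' | 'gpu'}."""
    # Rank jobs by estimated speedup (cpu_time / gpu_time), largest first.
    by_speedup = sorted(jobs, key=lambda j: jobs[j][0] / jobs[j][1],
                        reverse=True)
    return {j: ("gpu" if i < num_gpus else "cpu")
            for i, j in enumerate(by_speedup)}

jobs = {"cnn": (100, 10), "linreg": (20, 15), "rnn": (80, 16)}
print(place(jobs, num_gpus=1))  # cnn gets the GPU: its 10x speedup is largest
```

With one GPU, the CNN job (10x speedup) is placed on the GPU, while the linear regression (1.3x) and RNN (5x) jobs run on CPUs.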
[1] O. Alipourfard, H. H. Liu, J. Chen, S. Venkataraman, M. Yu, and M. Zhang. CherryPick: Adaptively unearthing the best cloud configurations for big data analytics. In NSDI, 2017.
[2] A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant resource fairness: Fair allocation of multiple resource types. In NSDI, 2011.
[3] T. N. Le, Z. Liu, Y. Chen, and C. Bash. Joint capacity planning and operational management for sustainable data centers and demand response. In Proceedings of the Seventh International Conference on Future Energy Systems. ACM, 2016.
[4] S. Venkataraman, Z. Yang, M. J. Franklin, B. Recht, and I. Stoica. Ernest: Efficient performance prediction for large-scale advanced analytics. In NSDI, pages 363--378, 2016.
Index Terms
- AlloX: Allocation across Computing Resources for Hybrid CPU/GPU clusters