ABSTRACT
Machine Learning (ML) is becoming critical for many industrial and scientific endeavors, and has a growing presence in High Performance Computing (HPC) environments. Neural network training on large data sets requires long execution times, and libraries like TensorFlow use GPU acceleration to reduce the total runtime of each calculation. This tutorial demonstrates how to 1) use Chameleon Cloud to perform comparative studies of ML training performance across different hardware configurations; and 2) run TensorFlow on NVIDIA GPUs while monitoring its power utilization.
Index Terms
- Machine Learning GPU Power Measurement on Chameleon Cloud
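A minimal sketch of the kind of power monitoring the abstract describes: it polls nvidia-smi for board power draw in a background thread while a toy TensorFlow model trains. It assumes an NVIDIA GPU with nvidia-smi on the PATH and a working TensorFlow install; the MNIST model and 0.5 s sampling interval are illustrative choices, not taken from the tutorial itself.

```python
# Hypothetical sketch: sample GPU power draw while TensorFlow trains.
# Assumes nvidia-smi is available and TensorFlow sees the GPU.
import subprocess
import threading
import time

import tensorflow as tf

samples = []                 # power readings in watts
stop = threading.Event()

def sample_power(interval_s=0.5):
    """Poll nvidia-smi for instantaneous board power draw (watts)."""
    while not stop.is_set():
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=power.draw",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True)
        samples.append(float(out.stdout.strip().splitlines()[0]))
        time.sleep(interval_s)

sampler = threading.Thread(target=sample_power, daemon=True)
sampler.start()

# Toy MNIST training run; replace with the workload under study.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=256)

stop.set()
sampler.join()
print(f"mean GPU power draw: {sum(samples) / len(samples):.1f} W "
      f"over {len(samples)} samples")
```

The same sampling loop can be pointed at different Chameleon Cloud node types to compare training-time power profiles across hardware configurations.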