Accelerating HPCG on Tianhe-2: A hybrid CPU-MIC algorithm | IEEE Conference Publication | IEEE Xplore

Accelerating HPCG on Tianhe-2: A hybrid CPU-MIC algorithm


Abstract:

In this paper, we propose a hybrid algorithm to enable and accelerate the High Performance Conjugate Gradient (HPCG) benchmark on a heterogeneous node with an arbitrary number of accelerators. In the hybrid algorithm, each subdomain is assigned to a node after a three-dimensional domain decomposition. The subdomain is further divided into several regular inner blocks and a single outer part using a flexible inner-outer partitioning strategy. Each inner block is assigned to a MIC device, and its size is adjustable to match the accelerator's computational power. The outer part is assigned to the CPU, and the thickness of its boundary layer is also adjustable to maintain load balance between the CPU and the MICs. By properly fusing the computational kernels with preceding ones, we present an asynchronous data transfer scheme that better overlaps local computation with PCI Express data transfer. All basic HPCG kernels, especially the time-consuming sparse matrix-vector multiplication (SpMV) and symmetric Gauss-Seidel relaxation (SymGS), are extensively optimized for both CPU and MIC at both the algorithmic and architectural levels. On a single node of Tianhe-2, which is composed of an Intel Xeon processor and three Intel Xeon Phi coprocessors, we obtain an aggregated performance of 50.2 Gflops, which is around 1.5% of the peak performance.
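
The inner-outer partitioning described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the inner region is cut into blocks or how block sizes are chosen, so everything below is an assumption made for illustration: the function name `partition_subdomain`, the choice to slice the inner region into slabs along the z axis, and the use of relative `weights` to stand in for each accelerator's computational power are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Block = Tuple[Tuple[int, int, int], Tuple[int, int, int]]  # (origin, shape)


def partition_subdomain(
    nx: int, ny: int, nz: int, thickness: int, weights: List[float]
) -> Tuple[List[Block], int]:
    """Split an nx*ny*nz subdomain into inner blocks and an outer shell.

    Hypothetical illustration: the outer shell of the given thickness goes
    to the CPU; the remaining inner box is cut into z-slabs, one per
    accelerator, with depths proportional to the given weights.
    """
    # Dimensions of the inner box once the shell is peeled off every face.
    ix, iy, iz = nx - 2 * thickness, ny - 2 * thickness, nz - 2 * thickness
    if thickness < 0 or min(ix, iy, iz) <= 0:
        raise ValueError("thickness leaves no inner region")
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")

    total = float(sum(weights))
    blocks: List[Block] = []
    running = 0.0
    prev_cut = 0
    for i, w in enumerate(weights):
        running += w
        # Cumulative rounding keeps the slab depths summing exactly to iz.
        cut = iz if i == len(weights) - 1 else round(iz * running / total)
        origin = (thickness, thickness, thickness + prev_cut)
        blocks.append((origin, (ix, iy, cut - prev_cut)))
        prev_cut = cut

    # Grid points left for the CPU: everything outside the inner box.
    outer_points = nx * ny * nz - ix * iy * iz
    return blocks, outer_points


if __name__ == "__main__":
    inner, outer = partition_subdomain(64, 64, 64, thickness=4, weights=[1, 1, 1])
    for origin, shape in inner:
        print("inner block at", origin, "shape", shape)
    print("outer points for CPU:", outer)
```

In this sketch, increasing `thickness` moves grid points from the accelerators to the CPU, and changing `weights` shifts work between accelerators, which mirrors the two adjustable parameters the abstract mentions. A real implementation would also have to handle halo exchange between blocks, which is omitted here.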
Date of Conference: 16-19 December 2014
Date Added to IEEE Xplore: 30 April 2015
Electronic ISBN: 978-1-4799-7615-7
Print ISSN: 1521-9097
Conference Location: Hsinchu, Taiwan

