
CUDAGA: A Portable Parallel Programming Model for GPU Cluster

  • Conference paper
  • Cloud Computing and Big Data (CloudCom-Asia 2015)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9106)


Abstract

GPU clusters are important for high performance computing because of their high performance/cost ratio. However, it is still hard for application developers to write parallel code for GPUs. MPI is the most widely used tool for parallel programming, and it requires developers to specify data locality and communication explicitly. Moreover, data transmission between CPU and GPU must also be handled with CUDA code. CUDAGA, a new parallel programming model for GPU clusters with CUDA, is presented to provide portable interfaces for communication on GPUs. GA (Global Arrays), a portable shared-memory programming model for distributed-memory computers, serves as the base to facilitate parallel programming and to maintain transparent global arrays on GPUs. Experiments show that CUDAGA reduces the difficulty of parallel programming while ensuring better performance for some specific applications.
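The CUDAGA interfaces themselves are not shown on this page. As a point of reference for the shared-memory style it extends, the sketch below uses the standard Global Arrays C API (NGA_Create, NGA_Put, NGA_Get), which the abstract names as the base of CUDAGA. It is illustrative only: it requires a Global Arrays and MPI installation to build, and the GPU-side extensions that CUDAGA adds are an assumption not depicted here.

```c
/* Sketch of the Global Arrays programming style that CUDAGA builds on.
   Uses the standard GA C API; requires the Global Arrays library and MPI.
   Illustrative only -- CUDAGA's GPU-specific interfaces are not shown. */
#include <stdio.h>
#include <mpi.h>
#include "ga.h"
#include "macdecls.h"

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    GA_Initialize();

    int dims[1]  = {1024};
    int chunk[1] = {-1};              /* let GA choose the data distribution */
    int g_a = NGA_Create(C_DBL, 1, dims, "a", chunk);

    /* Any process may read or write any part of the global array;
       GA resolves locality and performs the communication transparently. */
    if (GA_Nodeid() == 0) {
        double val = 3.14;
        int lo[1] = {0}, hi[1] = {0}, ld[1] = {1};
        NGA_Put(g_a, lo, hi, &val, ld);
    }
    GA_Sync();                        /* make the update globally visible */

    double out;
    int lo[1] = {0}, hi[1] = {0}, ld[1] = {1};
    NGA_Get(g_a, lo, hi, &out, ld);
    printf("process %d read %f\n", GA_Nodeid(), out);

    GA_Destroy(g_a);
    GA_Terminate();
    MPI_Finalize();
    return 0;
}
```

In the plain MPI+CUDA approach that the abstract contrasts with, the programmer would instead pair every such remote access with explicit MPI sends/receives plus cudaMemcpy calls between host and device; hiding that pattern behind a global-array put/get interface is the stated goal of CUDAGA.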



Acknowledgment

This work is supported by the National 973 Key Basic Research Plan of China (No. 2013CB2282036), the Major Subject of the State Grid Corporation of China (No. SGCC-MPLG001(001-031)-2012), the National 863 Basic Research Program of China (No. 2011AA05A118), the National Natural Science Foundation of China (No. 61133008), the National Science and Technology Pillar Program (No. 2012BAH14F02) and the independent innovation project of Huazhong University of Science and Technology.

Author information

Corresponding author

Correspondence to Ran Zheng.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, Y., Jin, H., Xu, D., Zheng, R., Liu, H., Zeng, J. (2015). CUDAGA: A Portable Parallel Programming Model for GPU Cluster. In: Qiang, W., Zheng, X., Hsu, CH. (eds) Cloud Computing and Big Data. CloudCom-Asia 2015. Lecture Notes in Computer Science, vol 9106. Springer, Cham. https://doi.org/10.1007/978-3-319-28430-9_16

  • DOI: https://doi.org/10.1007/978-3-319-28430-9_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28429-3

  • Online ISBN: 978-3-319-28430-9

  • eBook Packages: Computer Science (R0)
