Abstract
The key to achieving high performance on a GPU-enhanced cluster is efficient exploitation of each GPU’s powerful computing capability. In addition, carefully balancing the workload between CPUs and GPUs makes the computing power of the CPUs available as well. In this paper, we extend our earlier work on using a hybrid CPU-GPU cluster for real-world sedimentary basin simulation by further improving the CUDA implementations involved. We also carry out a thorough analysis of the newly achieved performance. Using 1024 GPUs and 12288 CPU cores together, our best hybrid CPU-GPU implementation achieves a double-precision performance of 72.8 TFlops for simulations on a huge 131072×131072 mesh.
Acknowledgements
The authors gratefully acknowledge support from the National Natural Science Foundation of China under NSFC Nos. 61033008, 61103080 and 61272145, SRFDP Nos. 20104307110002 and 20124307130004, Innovation in Graduate School of NUDT No. B120605, Hunan Provincial Innovation Foundation for Postgraduate under No. CX2012B030, and the FriNatek program of the Research Council of Norway under No. 214113/F20. Technical assistance from the National Supercomputing Center in Changsha is also acknowledged.
About this article
Cite this article
Wen, M., Su, H., Wei, W. et al. High efficient sedimentary basin simulations on hybrid CPU-GPU clusters. Cluster Comput 17, 359–369 (2014). https://doi.org/10.1007/s10586-013-0300-9