High efficient sedimentary basin simulations on hybrid CPU-GPU clusters

Cluster Computing

Abstract

The key to achieving high performance on a GPU-enhanced cluster is efficient exploitation of each GPU’s computing capability. Moreover, carefully balancing the workload between CPUs and GPUs releases additional computing power from the CPUs. In this paper, we extend our earlier work on using a hybrid CPU-GPU cluster for real-world sedimentary basin simulation by further improving the CUDA implementations involved. We also carry out a thorough analysis of the newly achieved performance. Using 1024 GPUs and 12288 CPU cores together, our best hybrid CPU-GPU implementation achieves a double-precision performance of 72.8 TFlops for simulations on a huge 131072×131072 mesh.

Acknowledgements

The authors gratefully acknowledge support from the National Natural Science Foundation of China under NSFC Nos. 61033008, 61103080 and 61272145, SRFDP Nos. 20104307110002 and 20124307130004, Innovation in Graduate School of NUDT No. B120605, Hunan Provincial Innovation Foundation for Postgraduate under No. CX2012B030, and the FriNatek program of the Research Council of Norway under No. 214113/F20. Technical assistance from the National Supercomputing Center in Changsha is also acknowledged.

Author information

Corresponding author

Correspondence to Huayou Su.

About this article

Cite this article

Wen, M., Su, H., Wei, W. et al. High efficient sedimentary basin simulations on hybrid CPU-GPU clusters. Cluster Comput 17, 359–369 (2014). https://doi.org/10.1007/s10586-013-0300-9

  • DOI: https://doi.org/10.1007/s10586-013-0300-9
