
An OpenCL micro-benchmark suite for GPUs and CPUs

The Journal of Supercomputing

Abstract

Open Computing Language (OpenCL) is a new industry standard for task-parallel and data-parallel heterogeneous computing on a variety of modern CPUs, GPUs, DSPs, and other microprocessor designs. OpenCL is vendor independent and hence not specialized for any particular compute device. Developing efficient OpenCL applications for a particular platform therefore still requires a deeper understanding of the architectural features of both the OpenCL model and the computing devices. For this purpose, we design and implement an OpenCL micro-benchmark suite for GPUs and CPUs. In this paper, we describe the implementation of our OpenCL micro-benchmarks and present measurements of hardware and software features, such as the performance of mathematical operations, bus bandwidths, memory architectures, branch synchronization, and scalability, on two multi-core CPUs (AMD Athlon II X2 250 and Intel Pentium Dual-Core E5400) and two GPUs (NVIDIA GeForce GTX 460se and AMD Radeon HD 6850). We also compare our measurements with existing benchmarks to demonstrate the reasonableness and correctness of our benchmark suite.
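As an illustration of the kind of measurement a bandwidth micro-benchmark makes, the following is a minimal sketch and not the suite's implementation. It assumes a plain host-memory copy as a stand-in for an OpenCL buffer transfer (for example one issued with `clEnqueueWriteBuffer`), and the names `bandwidth_gb_per_s`, `best_time`, and `measure_copy_bandwidth` are hypothetical. It shows the usual pattern: warm up, repeat the operation, keep the fastest run, and convert bytes and seconds into GB/s.

```python
import time


def bandwidth_gb_per_s(num_bytes, seconds):
    """Convert a transfer size and elapsed time to GB/s (1 GB = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / 1e9


def best_time(fn, repeats=5, warmup=1):
    """Run fn untimed `warmup` times, then return the fastest of `repeats` timed runs."""
    for _ in range(warmup):
        fn()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def measure_copy_bandwidth(num_bytes=16 * 1024 * 1024, repeats=5):
    """Time a host-memory copy (assumed stand-in for a host-to-device transfer)."""
    src = bytearray(num_bytes)
    dst = bytearray(num_bytes)

    def copy():
        dst[:] = src  # slice assignment copies the whole buffer

    seconds = best_time(copy, repeats=repeats)
    return bandwidth_gb_per_s(num_bytes, seconds)


if __name__ == "__main__":
    print(f"host copy bandwidth: {measure_copy_bandwidth():.2f} GB/s")
```

Taking the fastest of several runs is one common way to reduce noise from caches and scheduling; a real OpenCL benchmark would additionally need to wait for the device to finish (for example with a blocking transfer) before stopping the timer.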






Acknowledgments

This material is based upon work supported by the National Natural Science Foundation of China (No. 61073010 and No. 61272166), the National Science and Technology Major Project of China (No. 2012ZX01039-004), and the State Key Laboratory of Software Development Environment of China (No. SKLSDE-2012ZX-02).

Author information


Corresponding author

Correspondence to Xiaohua Shi.


About this article

Cite this article

Yan, X., Shi, X., Wang, L. et al. An OpenCL micro-benchmark suite for GPUs and CPUs. J Supercomput 69, 693–713 (2014). https://doi.org/10.1007/s11227-014-1112-2


  • DOI: https://doi.org/10.1007/s11227-014-1112-2
