Support OpenCL 2.0 Compiler on LLVM for PTX Simulators

Yang, Chun-Chieh; Wang, Shao-Chung; Hsu, Min-Yi; Chang, Yuan-Ming; Hwang, Yuan-Shin; Lee, Jenq-Kuen

doi:10.1007/s11265-018-1377-4

Published: 23 May 2018

Volume 91, pages 261–271 (2019)

Journal of Signal Processing Systems

Abstract

Heterogeneous systems that combine multiple CPUs and GPUs for high-performance computing are becoming increasingly popular, and OpenCL (Open Computing Language) provides a framework for writing programs that execute across heterogeneous devices. Compared with OpenCL 1.2, the new features of OpenCL 2.0 give developers more expressive power for programming heterogeneous computing environments. Currently, gem5-gpu, which integrates gem5 and GPGPU-Sim, offers an experimental simulation environment for OpenCL. Within gem5-gpu, gem5 supports only CUDA, whereas GPGPU-Sim can support OpenCL by compiling OpenCL kernel code to PTX code using real GPU drivers. However, this compilation flow in GPGPU-Sim supports only up to OpenCL 1.2. OpenCL 2.0 introduces new features such as workgroup built-in functions, extended atomic built-in functions, and device-side enqueue. To support OpenCL 2.0, the compiler must therefore be extended so that it can compile OpenCL 2.0 kernel code to PTX code. In this paper, we propose a compiler, derived from the LLVM (Low Level Virtual Machine) compiler, that adds these features so that the simulator can support OpenCL 2.0. The proposed compiler creates a local buffer for each workgroup to enable workgroup built-in functions, and it adds atomic built-in functions with memory order and memory scope for OpenCL 2.0 to the NVPTX backend. Furthermore, the APIs available in CUDA are used to implement the OpenCL 2.0 device-side enqueue of kernels, and the compilation schemes in Clang are revised accordingly. The AMD APP SDK 3.0 and NTU OpenCL benchmarks are used to verify that the proposed compiler supports the features of OpenCL 2.0.
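To make the idea of backing a workgroup built-in function with a per-workgroup local buffer more concrete, the following is a minimal host-side sketch in Python. It is only an illustration under stated assumptions, not the lowering used by the proposed compiler: the function names `emulate_work_group_reduce_add` and `run_ndrange`, the tree-reduction strategy, and the broadcast through slot 0 are all hypothetical choices made for this example. The list `local_buf` stands in for OpenCL `__local` memory, and each pass of the `while` loop stands in for the work done between two workgroup barriers.

```python
# Hypothetical emulation of an OpenCL 2.0 workgroup reduction
# (in the spirit of work_group_reduce_add) using a per-workgroup buffer.
# All names here are illustrative assumptions, not taken from the paper.

def emulate_work_group_reduce_add(values):
    """Emulate one workgroup; values[lid] is the input of work-item lid."""
    n = len(values)
    if n == 0:
        return []
    # Per-workgroup scratch buffer, standing in for __local memory.
    local_buf = list(values)
    stride = 1
    while stride < n:
        # One pass corresponds to the code between two barriers: the
        # writing slots (multiples of 2 * stride) never overlap the
        # slots being read (lid + stride), so the updates are independent.
        for lid in range(0, n, 2 * stride):
            if lid + stride < n:
                local_buf[lid] += local_buf[lid + stride]
        stride *= 2
    # Every work-item reads the total back from slot 0.
    return [local_buf[0]] * n


def run_ndrange(data, local_size):
    """Split a 1-D range into workgroups and reduce each one separately."""
    out = []
    for start in range(0, len(data), local_size):
        out.extend(emulate_work_group_reduce_add(data[start:start + local_size]))
    return out
```

For example, `run_ndrange([1, 2, 3, 4, 5, 6, 7, 8], 4)` treats the input as two workgroups of four work-items, and every work-item receives the sum of its own workgroup. A real lowering would additionally need barriers between passes and a buffer sized to the workgroup; both are left implicit in this sequential sketch.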

Author information

Authors and Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
Chun-Chieh Yang, Shao-Chung Wang, Min-Yi Hsu, Yuan-Ming Chang & Jenq-Kuen Lee
Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Yuan-Shin Hwang

Corresponding author

Correspondence to Jenq-Kuen Lee.

About this article

Cite this article

Yang, CC., Wang, SC., Hsu, MY. et al. Support OpenCL 2.0 Compiler on LLVM for PTX Simulators. J Sign Process Syst 91, 261–271 (2019). https://doi.org/10.1007/s11265-018-1377-4

Received: 08 September 2017
Revised: 19 March 2018
Accepted: 01 May 2018
Published: 23 May 2018
Issue Date: March 2019
DOI: https://doi.org/10.1007/s11265-018-1377-4
