A 3D graphics rendering pipeline implementation based on the openCL massively parallel processing

Kim, Mingyu; Baek, Nakhoon

doi:10.1007/s11227-020-03581-8

Published: 04 January 2021

Volume 77, pages 7351–7367, (2021)

The Journal of Supercomputing

Abstract

Massively parallel computing libraries and devices are now widely used alongside traditional 3D graphics systems. In this paper, we present a full 3D fixed-function graphics pipeline based on OpenCL, one of the most widely used massively parallel computing libraries. With it, full 3D graphics features, including WebGL, Web3D, and others, can be implemented on massively parallel computations without underlying 3D graphics hardware support. Many previous works focused on CUDA, another massively parallel system, which has the drawback of limited availability. In contrast, we designed and implemented a new architecture with OpenCL, which is now available on various computing devices, including most CPUs, GPUs, and, at least theoretically, special-purpose embedded FPGAs. Our work provides full 3D graphics features on OpenCL-capable systems without dedicated 3D graphics hardware, with the aim of making 3D graphics features ubiquitous. Technically, we used a top-down approach to rendering, from the whole screen down to individual pixels. At each stage, we tuned our OpenCL implementations and their global and local parameter spaces. We present the details of our design and the final results of our implementation, and show its correctness and efficiency.
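
The top-down idea in the abstract (whole screen first, then individual pixels) can be illustrated with a small sketch. This is not the authors' implementation: it is a serial Python illustration of coarse-to-fine triangle rasterization, and the tile size `TILE` and the helper names `edge`, `covers`, and `rasterize` are assumptions introduced here. In an OpenCL setting, one might map each tile to a work-group and each pixel to a work-item, which is where global and local sizes would come into play, but that mapping is also an assumption.

```python
# Hypothetical sketch of top-down (coarse-to-fine) triangle rasterization.
# TILE, edge, covers, and rasterize are illustrative names, not from the paper.

TILE = 8  # assumed tile edge length in pixels


def edge(ax, ay, bx, by, px, py):
    """Edge function: its sign tells on which side of line A->B point P lies."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covers(tri, px, py):
    """True when point P is inside or on the border of the triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    w0 = edge(ax, ay, bx, by, px, py)
    w1 = edge(bx, by, cx, cy, px, py)
    w2 = edge(cx, cy, ax, ay, px, py)
    # Accept either winding order.
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def rasterize(tri, width, height):
    """Return the set of (x, y) pixels whose centers the triangle covers."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    pixels = set()
    # Coarse stage: walk screen tiles and skip those outside the bounding box.
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            if tx > max(xs) or tx + TILE < min(xs):
                continue
            if ty > max(ys) or ty + TILE < min(ys):
                continue
            # Fine stage: test every pixel center inside the surviving tile.
            for y in range(ty, min(ty + TILE, height)):
                for x in range(tx, min(tx + TILE, width)):
                    if covers(tri, x + 0.5, y + 0.5):
                        pixels.add((x, y))
    return pixels


if __name__ == "__main__":
    tri = [(0, 0), (16, 0), (0, 16)]
    print(len(rasterize(tri, 32, 32)))
```

The coarse stage only rejects tiles conservatively using the bounding box, so the result matches a brute-force per-pixel test while visiting fewer pixels.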

Acknowledgements

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (Grant No. NRF-2019R1I1A3A01061310).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Kyungpook National University, Daegu, 41566, Republic of Korea
Mingyu Kim & Nakhoon Baek

Corresponding author

Correspondence to Nakhoon Baek.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, M., Baek, N. A 3D graphics rendering pipeline implementation based on the openCL massively parallel processing. J Supercomput 77, 7351–7367 (2021). https://doi.org/10.1007/s11227-020-03581-8

Accepted: 15 December 2020
Published: 04 January 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11227-020-03581-8
