Complex shading efficiently for ray tracing on GPU

Yang, Xin; Xu, Duan-qing; Zhao, Lei; Yang, Bing

doi:10.1007/s11042-013-1712-5

Published: 03 October 2013

Volume 74, pages 1091–1106, (2015)
Multimedia Tools and Applications

Abstract

Complex shading is often associated with long shaders and heavy data access. To obtain good performance on current-generation GPU hardware, algorithms are needed that manage data, schedule threads more efficiently, and organize memory access within the GPU memory hierarchy. In this paper, we propose an approach that accelerates rendering with complex shaders by analyzing and sorting shading jobs according to their complexity and potential memory access. We show that by sorting these shading jobs across three levels of the memory hierarchy and reorganizing thread blocks according to complexity, all shading jobs are scheduled in order, which significantly improves cache utilization and GPU hardware utilization, especially where performance suffers from heavy branching. All sorting is performed on the CPU, which has ample control logic, and is very efficient compared with the expensive compaction operation on the GPU. Our experiments with this hierarchy demonstrate improvements over SIMD packet tracing with compaction on the GPU.
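The general idea of CPU-side job sorting can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the sort keys or the block layout, so the `ShadingJob` fields (`shader_id`, `cost`, `data_page`), the key order, and the function names below are all assumptions chosen for illustration. The sketch orders jobs by estimated complexity, then shader, then a coarse memory location, and packs them into blocks that never mix shaders, so that threads in one block would follow the same branch.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class ShadingJob:
    ray_id: int
    shader_id: int   # hypothetical: shader required at the hit point
    cost: int        # hypothetical: estimated shader complexity
    data_page: int   # hypothetical: coarse address of the data the shader reads


def sort_jobs(jobs):
    """Order jobs by complexity, then shader, then memory location (assumed keys)."""
    return sorted(jobs, key=lambda j: (j.cost, j.shader_id, j.data_page))


def build_blocks(sorted_jobs, block_size):
    """Pack consecutive jobs that share a shader into blocks of at most block_size."""
    blocks = []
    for _, run in groupby(sorted_jobs, key=lambda j: j.shader_id):
        run = list(run)
        for start in range(0, len(run), block_size):
            blocks.append(run[start:start + block_size])
    return blocks


def divergent_blocks(blocks):
    """Count blocks whose jobs need more than one shader."""
    return sum(1 for block in blocks if len({j.shader_id for j in block}) > 1)
```

In this toy model, jobs that arrive interleaved by pixel order produce mixed-shader blocks when chunked naively, while `build_blocks(sort_jobs(jobs), block_size)` yields shader-uniform blocks whose entries are also ordered by `data_page`. On a real GPU the per-block job lists would then be uploaded and launched as thread blocks; that step is outside the scope of this sketch.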

Acknowledgments

The authors would like to thank the anonymous reviewers for their careful reading of the original manuscript. Their comments and suggestions have led to a much better presentation of the paper. This research is supported in part by the National Natural Science Foundation of China under Grant No. 61300084, in part by the China Postdoctoral Science Foundation under Grant No. 2012M520625, and in part by the Scientific Research Foundation of Dalian University of Technology under Grant DUT12RC(3)63. The authors also appreciate the support of the Nvidia and Microsoft corporations.

Author information

Authors and Affiliations

College of Computer Science, Dalian University of Technology, Dalian, China
Xin Yang
College of Computer Science, Zhejiang University, Hangzhou, China
Duan-qing Xu & Lei Zhao
School of Computer Science, Hangzhou Dianzi University, Hangzhou, China
Bing Yang

Corresponding author

Correspondence to Xin Yang.

About this article

Cite this article

Yang, X., Xu, Dq., Zhao, L. et al. Complex shading efficiently for ray tracing on GPU. Multimed Tools Appl 74, 1091–1106 (2015). https://doi.org/10.1007/s11042-013-1712-5

Issue Date: February 2015
