DOI: 10.1145/3620666.3651385
research-article
Open Access

GSCore: Efficient Radiance Field Rendering via Architectural Support for 3D Gaussian Splatting

Published: 27 April 2024

ABSTRACT

This paper presents GSCore, a hardware acceleration unit that efficiently executes the rendering pipeline of 3D Gaussian Splatting with algorithmic optimizations. GSCore builds on observations from an in-depth analysis of Gaussian-based radiance field rendering to improve computational efficiency and bring the technique to wide adoption. To this end, we present several optimization techniques: a Gaussian shape-aware intersection test, hierarchical sorting, and subtile skipping, all of which are synergistically integrated into GSCore. We implement the hardware design of GSCore, synthesize it using a commercial 28nm technology, and evaluate its performance across a range of synthetic and real-world scenes with varying image resolutions. Our evaluation shows that GSCore achieves a 15.86× speedup on average over a mobile consumer GPU, with a substantially smaller area and lower energy consumption.

Published in

ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
April 2024
1106 pages
ISBN: 9798400703867
DOI: 10.1145/3620666

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 535 of 2,713 submissions (20%)