Abstract
Recently, 3D Gaussian splatting (3D-GS) has gained popularity in novel-view scene synthesis. It addresses the challenges of lengthy training times and slow rendering speeds associated with Neural Radiance Fields (NeRFs). Through rapid, differentiable rasterization of 3D Gaussians, 3D-GS achieves real-time rendering and accelerated training. It does, however, demand substantial memory for both training and storage, since each scene is represented by a point cloud of millions of Gaussians. We present a technique that uses quantized embeddings to significantly reduce per-point memory storage requirements, together with a coarse-to-fine training strategy for faster and more stable optimization of the Gaussian point clouds. Our approach also introduces a pruning stage that yields scene representations with fewer Gaussians, leading to faster training and rendering for real-time synthesis of high-resolution scenes. We reduce storage memory by more than an order of magnitude while preserving reconstruction quality. We validate the effectiveness of our approach on a variety of datasets and scenes, preserving visual quality while consuming 10–20\(\times \) less memory and achieving faster training and inference. Code is available here.
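The abstract names quantized per-point embeddings as the main compression mechanism. As a rough sketch of that idea (not the paper's actual implementation), the PyTorch snippet below stores a low-dimensional latent per Gaussian, quantizes it with a straight-through estimator during training, and decodes it into per-point color features with a small shared MLP; the latent dimension, quantization step, and decoder architecture are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class QuantizedEmbedding(nn.Module):
    """Per-point latent embedding with quantization-aware training.

    Minimal sketch (not the paper's code): latents are kept in floating
    point, rounded to a uniform grid in the forward pass, and gradients
    flow through the rounding via the straight-through estimator.
    """

    def __init__(self, num_points: int, dim: int, step: float = 1.0 / 16):
        super().__init__()
        self.latents = nn.Parameter(torch.zeros(num_points, dim))
        self.step = step  # quantization step size (hypothetical choice)

    def forward(self) -> torch.Tensor:
        q = torch.round(self.latents / self.step) * self.step
        # Straight-through estimator: use quantized values in the forward
        # pass, back-propagate as if the rounding were the identity.
        return self.latents + (q - self.latents).detach()


# Usage: decode quantized per-Gaussian latents into per-point color
# features with a small shared MLP (all sizes are illustrative).
emb = QuantizedEmbedding(num_points=100_000, dim=16)
decoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 48))
color_features = decoder(emb())  # shape (100_000, 48)
```

Under this kind of scheme, only the integer grid indices of the latents and the small shared decoder would need to be stored per scene, which is where savings in per-point memory would come from.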
Acknowledgements
This work was partially supported by IARPA via Department of Interior/Interior Business Center (DOI/IBC) contract number 140D0423C0076. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The authors acknowledge UMD’s supercomputing resources made available for conducting this research. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Girish, S., Gupta, K., Shrivastava, A. (2025). EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15121. Springer, Cham. https://doi.org/10.1007/978-3-031-73036-8_4
Print ISBN: 978-3-031-73035-1
Online ISBN: 978-3-031-73036-8