End-to-End Rate-Distortion Optimized 3D Gaussian Representation

Wang, Henan; Zhu, Hanxin; He, Tianyu; Feng, Runsen; Deng, Jiajun; Bian, Jiang; Chen, Zhibo

doi:10.1007/978-3-031-73636-0_5

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15116)

Included in the following conference series: European Conference on Computer Vision

Abstract

3D Gaussian Splatting (3DGS) has emerged as a technique with remarkable potential in 3D representation and image rendering. However, the substantial storage overhead of 3DGS significantly impedes its practical application. In this work, we formulate compact 3D Gaussian learning as an end-to-end Rate-Distortion Optimization (RDO) problem and propose RDO-Gaussian, which achieves flexible and continuous rate control. RDO-Gaussian addresses two main issues in current schemes: 1) Unlike prior work that minimizes the rate under a fixed distortion, we introduce dynamic pruning and entropy-constrained vector quantization (ECVQ), which optimize the rate and distortion at the same time. 2) Previous works treat the colors of each Gaussian equally, whereas we model the colors of different regions and materials with learnable numbers of parameters. We verify our method on both real and synthetic scenes, showing that RDO-Gaussian reduces the size of the 3D Gaussian representation by over 40×, and surpasses existing methods in rate-distortion performance.
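
For intuition, the joint rate-distortion trade-off behind ECVQ can be sketched as follows. This is a minimal illustration of the classical entropy-constrained assignment rule (pick the codeword that minimizes distortion plus lam times its ideal code length), not the authors' implementation. The function name ecvq_assign, the toy codebook, the index probabilities, and the values of lam are all assumptions made for the example.

```python
import math

def ecvq_assign(vector, codebook, probs, lam):
    """Return the index k minimizing ||vector - codebook[k]||^2 + lam * (-log2 probs[k]).

    Hypothetical helper for illustration only: `probs` are assumed index
    probabilities, and -log2(p) is the ideal code length in bits.
    """
    best_index, best_cost = -1, math.inf
    for k, (codeword, p) in enumerate(zip(codebook, probs)):
        # Squared-error distortion between the input and this codeword.
        distortion = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        # Rate term: rarely used codewords cost more bits to signal.
        rate_bits = -math.log2(p)
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index

# Toy data (assumed): codeword 0 is far more probable than codeword 1.
codebook = [(0.0, 0.0), (1.0, 1.0)]
probs = [0.9, 0.1]
x = (0.6, 0.6)

nearest = ecvq_assign(x, codebook, probs, lam=0.0)      # pure distortion
rate_aware = ecvq_assign(x, codebook, probs, lam=0.2)   # distortion + rate
print(nearest, rate_aware)
```

With lam set to 0 the rule reduces to nearest-neighbor quantization and selects the closer codeword. With a larger lam, the cheaper-to-code (more probable) codeword wins even though it is farther away, which is the sense in which rate and distortion are traded off within a single objective.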

Acknowledgements

This work was supported in part by NSFC under Grants 62371434 and 62021001.

Author information

Authors and Affiliations

University of Science and Technology of China, Hefei, China
Henan Wang, Hanxin Zhu, Runsen Feng & Zhibo Chen
Microsoft Research Asia, Beijing, China
Tianyu He & Jiang Bian
The University of Adelaide, Adelaide, Australia
Jiajun Deng

Corresponding author

Correspondence to Henan Wang.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

Cite this paper

Wang, H. et al. (2025). End-to-End Rate-Distortion Optimized 3D Gaussian Representation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15116. Springer, Cham. https://doi.org/10.1007/978-3-031-73636-0_5

DOI: https://doi.org/10.1007/978-3-031-73636-0_5
Published: 05 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73635-3
Online ISBN: 978-3-031-73636-0
eBook Packages: Computer Science, Computer Science (R0)