Abstract
3D Gaussian Splatting (3DGS) has become an emerging technique with remarkable potential in 3D representation and image rendering. However, the substantial storage overhead of 3DGS significantly impedes its practical applications. In this work, we formulate the compact 3D Gaussian learning as an end-to-end Rate-Distortion Optimization (RDO) problem and propose RDO-Gaussian that can achieve flexible and continuous rate control. RDO-Gaussian addresses two main issues that exist in current schemes: 1) Different from prior endeavors that minimize the rate under the fixed distortion, we introduce dynamic pruning and entropy-constrained vector quantization (ECVQ) that optimize the rate and distortion at the same time. 2) Previous works treat the colors of each Gaussian equally, while we model the colors of different regions and materials with learnable numbers of parameters. We verify our method on both real and synthetic scenes, showcasing that RDO-Gaussian greatly reduces the size of 3D Gaussian over 40\(\times \), and surpasses existing methods in rate-distortion performance.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Ballé, J., Laparra, V., Simoncelli, E.P.: End-to-end optimization of nonlinear transform codes for perceptual quality. In: 2016 Picture Coding Symposium (PCS), pp. 1–5. IEEE (2016)
Ballé, J., Laparra, V., Simoncelli, E.P.: End-to-end optimized image compression. In: International Conference on Learning Representations (2017)
Barron, J.T., Mildenhall, B., Tancik, M., Hedman, P., Martin-Brualla, R., Srinivasan, P.P.: Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields. In: ICCV (2021)
Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5470–5479 (2022)
Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
Chen, A., Xu, Z., Geiger, A., Yu, J., Su, H.: Tensorf: tensorial radiance fields. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13692, pp. 333–350. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19824-3_20
Chou, P.A., Lookabaugh, T., Gray, R.M.: Entropy-constrained vector quantization. IEEE Trans. Acoust. Speech Signal Process. 37(1), 31–42 (1989)
Deng, C.L., Tartaglione, E.: Compressing explicit voxel grid representations: fast nerfs become also small. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 1236–1245 (2023)
Duan, Y., Wei, F., Dai, Q., He, Y., Chen, W., Chen, B.: 4D gaussian splatting: towards efficient novel view synthesis for dynamic scenes. arXiv preprint arXiv:2402.03307 (2024)
Fan, Z., Wang, K., Wen, K., Zhu, Z., Xu, D., Wang, Z.: LightGaussian: Unbounded 3D gaussian compression with 15x reduction and 200+ FPS. arXiv preprint arXiv:2311.17245 (2023)
Fridovich-Keil, S., Yu, A., Tancik, M., Chen, Q., Recht, B., Kanazawa, A.: Plenoxels: radiance fields without neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5501–5510 (2022)
Gersho, A., Gray, R.M.: Vector Quantization and Signal Compression, vol. 159. Springer, Heidelberg (2012)
Girish, S., Shrivastava, A., Gupta, K.: SHACIRA: scalable HAsh-grid compression for implicit neural representations. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 17513–17524 (2023)
Graziosi, D., Nakagami, O., Kuma, S., Zaghetto, A., Suzuki, T., Tabatabai, A.: An overview of ongoing point cloud compression standardization activities: video-based (V-PCC) and geometry-based (G-PCC). APSIPA Trans. Signal Inf. Process. 9, e13 (2020)
Hedman, P., Philip, J., Price, T., Frahm, J.M., Drettakis, G., Brostow, G.: Deep blending for free-viewpoint image-based rendering. ACM Trans. Graph. (ToG) 37(6), 1–15 (2018)
Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4) (2023)
Knapitsch, A., Park, J., Zhou, Q.Y., Koltun, V.: Tanks and temples: benchmarking large-scale scene reconstruction. ACM Trans. Graph. (ToG) 36(4), 1–13 (2017)
Lee, J.C., Rho, D., Sun, X., Ko, J.H., Park, E.: Compact 3D gaussian representation for radiance field. arXiv preprint arXiv:2311.13681 (2023)
Li, L., Shen, Z., Wang, Z., Shen, L., Bo, L.: Compressing volumetric radiance fields to 1 MB. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4222–4231 (2023)
Liu, X., et al.: HumanGaussian: text-driven 3D human generation with gaussian splatting. arXiv preprint arXiv:2311.17061 (2023)
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. Commun. ACM 65(1), 99–106 (2021)
Morgenstern, W., Barthel, F., Hilsmann, A., Eisert, P.: Compact 3D scene representation via self-organizing Gaussian grids. arXiv preprint arXiv:2312.13299 (2023)
Müller, T., Evans, A., Schied, C., Keller, A.: Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph. (ToG) 41(4), 1–15 (2022)
Navaneet, K., Meibodi, K.P., Koohpayegani, S.A., Pirsiavash, H.: Compact3D: compressing gaussian splat radiance field models with vector quantization. arXiv preprint arXiv:2311.18159 (2023)
Peng, S., Jiang, C., Liao, Y., Niemeyer, M., Pollefeys, M., Geiger, A.: Shape as points: a differentiable Poisson solver. Adv. Neural. Inf. Process. Syst. 34, 13032–13044 (2021)
Reiser, C., et al.: MeRF: memory-efficient radiance fields for real-time view synthesis in unbounded scenes. ACM Trans. Graph. (TOG) 42(4), 1–12 (2023)
Rho, D., Lee, B., Nam, S., Lee, J.C., Ko, J.H., Park, E.: Masked wavelet representation for compact neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20680–20690 (2023)
Sun, C., Sun, M., Chen, H.T.: Direct voxel grid optimization: super-fast convergence for radiance fields reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5459–5469 (2022)
Takikawa, T., et al.: Variable bitrate neural fields. In: ACM SIGGRAPH 2022 Conference Proceedings, pp. 1–9 (2022)
Tang, J., Chen, X., Wang, J., Zeng, G.: Compressible-composable nerf via rank-residual decomposition. Adv. Neural. Inf. Process. Syst. 35, 14798–14809 (2022)
Tang, J., Ren, J., Zhou, H., Liu, Z., Zeng, G.: DreamGaussian: generative gaussian splatting for efficient 3D content creation. arXiv preprint arXiv:2309.16653 (2023)
Verbin, D., Hedman, P., Mildenhall, B., Zickler, T., Barron, J.T., Srinivasan, P.P.: Ref-NeRF: structured view-dependent appearance for neural radiance fields. In: CVPR (2022)
Wang, L., et al.: Fourier plenoctrees for dynamic radiance field rendering in real-time. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13524–13534 (2022)
Wu, G., et al.: 4D Gaussian splatting for real-time dynamic scene rendering. arXiv preprint arXiv:2310.08528 (2023)
Xu, Q., et al.: Point-NeRF: point-based neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5438–5448 (2022)
Yi, T., et al.: GaussianDreamer: fast generation from text to 3D Gaussians by bridging 2D and 3D diffusion models. In: CVPR (2024)
Yu, A., Li, R., Tancik, M., Li, H., Ng, R., Kanazawa, A.: Plenoctrees for real-time rendering of neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5752–5761 (2021)
Yu, Z., Chen, A., Huang, B., Sattler, T., Geiger, A.: Mip-splatting: alias-free 3D Gaussian splatting (2023)
Zhao, T., Chen, J., Leng, C., Cheng, J.: TinyNeRF: towards 100 x compression of voxel radiance fields. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 3588–3596 (2023)
Zhu, H., He, T., Chen, Z.: CMC: few-shot novel view synthesis via cross-view multiplane consistency. arXiv preprint arXiv:2402.16407 (2024)
Zhu, H., He, T., Li, X., Li, B., Chen, Z.: Is vanilla MLP in neural radiance field enough for few-shot view synthesis? arXiv preprint arXiv:2403.06092 (2024)
Acknowledgements
This work was supported in part by NSFC under Grant 62371434, 62021001.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, H. et al. (2025). End-to-End Rate-Distortion Optimized 3D Gaussian Representation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15116. Springer, Cham. https://doi.org/10.1007/978-3-031-73636-0_5
Download citation
DOI: https://doi.org/10.1007/978-3-031-73636-0_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73635-3
Online ISBN: 978-3-031-73636-0
eBook Packages: Computer ScienceComputer Science (R0)