Abstract
Triplane-based radiance fields have gained attention in recent years for their ability to effectively disentangle 3D scenes into a high-quality representation at low computational cost. A key requirement of this representation is precise input camera poses. However, due to the local-update property of the triplane, joint estimation in the style of previous joint pose-NeRF optimization works easily falls into local minima. To this end, we propose a Disentangled Triplane Generation module that introduces global feature context and smoothness into triplane learning, mitigating the errors caused by local updates. We then propose Disentangled Plane Aggregation to mitigate the entanglement caused by the common triplane feature aggregation during camera pose updating. In addition, we introduce a two-stage warm-start training strategy to reduce the implicit constraints imposed by the triplane generator. Quantitative and qualitative results demonstrate that our method achieves state-of-the-art performance in novel view synthesis with noisy or unknown camera poses, as well as efficient optimization convergence. Project page: https://gaohchen.github.io/DiGARR/.
S. Shen and H. Gao—Equal contribution.
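For context, the "common triplane feature aggregation" the abstract refers to is typically implemented by orthogonally projecting each 3D query point onto the three axis-aligned feature planes, bilinearly sampling a feature from each, and combining the three features (by sum or concatenation) before decoding. The sketch below is a minimal PyTorch illustration of that standard baseline scheme (names such as `sample_triplane` are illustrative, not from the paper); it shows what the proposed Disentangled Plane Aggregation departs from, not the paper's method itself.

```python
import torch
import torch.nn.functional as F


def sample_triplane(planes: torch.Tensor, xyz: torch.Tensor) -> torch.Tensor:
    """Query a triplane field at 3D points via bilinear plane sampling.

    planes: (3, C, H, W) feature planes, ordered XY, XZ, YZ.
    xyz:    (N, 3) query points, assumed normalized to [-1, 1].
    Returns (N, 3 * C) features, one C-vector per plane, concatenated.
    """
    # Orthogonally project every point onto each of the three planes.
    coords = torch.stack(
        (xyz[:, [0, 1]],   # (x, y) -> XY plane
         xyz[:, [0, 2]],   # (x, z) -> XZ plane
         xyz[:, [1, 2]]),  # (y, z) -> YZ plane
        dim=0)             # (3, N, 2)

    # grid_sample treats the three planes as a batch of 2D feature maps.
    grid = coords.unsqueeze(1)                          # (3, 1, N, 2)
    feats = F.grid_sample(planes, grid, mode="bilinear",
                          align_corners=True)           # (3, C, 1, N)
    feats = feats.squeeze(2).permute(2, 0, 1)           # (N, 3, C)

    # "Common" aggregation: concatenate (or sum) the per-plane features,
    # so every query mixes contributions from all three planes.
    return feats.reshape(feats.shape[0], -1)


# Toy usage: 32-channel 128x128 planes, 1024 random query points.
planes = torch.randn(3, 32, 128, 128)
pts = torch.rand(1024, 3) * 2 - 1
features = sample_triplane(planes, pts)  # (1024, 96), fed to an MLP decoder
```

Because each query mixes features from all three planes, a pose update that shifts the projected coordinates perturbs all planes at once; this coupling is the entanglement the abstract says the Disentangled Plane Aggregation is designed to mitigate.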
Acknowledgments
This work is financially supported by the Outstanding Talents Training Fund in Shenzhen, the Shenzhen Science and Technology Program (Shenzhen Cultivation of Excellent Scientific and Technological Innovation Talents project, Grant No. RCJC20200714114435057; Shenzhen-Hong Kong joint funding project, Grant No. SGDX20211123144400001), the Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, the National Natural Science Foundation of China (U21B2012), and the R24115SG MIGU-PKU META VISION TECHNOLOGY INNOVATION LAB. Jianbo Jiao is supported by the Royal Society Short Industry Fellowship (SIF\R1\231009).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shen, S. et al. (2025). Disentangled Generation and Aggregation for Robust Radiance Fields. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15107. Springer, Cham. https://doi.org/10.1007/978-3-031-72967-6_13
DOI: https://doi.org/10.1007/978-3-031-72967-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72966-9
Online ISBN: 978-3-031-72967-6
eBook Packages: Computer Science, Computer Science (R0)