Direct Alignment for Robust NeRF Learning

  • Conference paper
  • Computer Vision – ACCV 2024 (ACCV 2024)

Abstract

Differentiable volume rendering has evolved into the prevalent optimization technique for creating implicit and explicit maps. Numerous efforts have explored the role of camera pose optimization and non-rigid tracking within Neural Radiance Fields (NeRFs). However, the relation between differentiable volume rendering and classical multi-view geometry remains underexplored. In this work, we investigate the role of direct alignment in radiance field estimation by incorporating a simple but effective loss while training NeRFs. Armed with good practices for direct alignment, and leveraging the effectiveness of volumetric representations in occlusion handling, our proposed framework reconstructs real scenes from sparse or dense views at much higher accuracy. We show that, despite relying on photometric consistency, incorporating direct alignment improves the view synthesis accuracy of NeRFs by \(12\%\) with known poses on the LLFF dataset, whereas joint optimization of pose and radiance field gains over \(18\%\) in view synthesis accuracy, with rotation and translation errors going down by \(64\%\) and \(57\%\) respectively.
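For intuition, here is a minimal PyTorch sketch of the kind of depth-guided direct alignment term the abstract refers to: source-view pixels are back-projected with NeRF-rendered depth, reprojected into a second view, and compared photometrically. This is an illustrative assumption on our part, not the paper's implementation; all names (`K`, `T_ts`, `depth_src`) are hypothetical placeholders, and practical versions would also mask pixels that project outside the target image or are occluded.

```python
import torch
import torch.nn.functional as F

def direct_alignment_loss(img_src, img_tgt, depth_src, K, T_ts):
    """Generic photometric alignment: warp source pixels into the target view.

    img_src, img_tgt: (3, H, W) images; depth_src: (H, W) NeRF-rendered depth;
    K: (3, 3) intrinsics; T_ts: (4, 4) source-to-target camera transform.
    """
    _, H, W = img_src.shape
    # Pixel grid in homogeneous coordinates.
    v, u = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).reshape(3, -1)
    # Back-project with rendered depth, then move points to the target frame.
    cam = torch.linalg.inv(K) @ pix * depth_src.reshape(1, -1)
    cam_h = torch.cat([cam, torch.ones(1, cam.shape[1])], dim=0)
    cam_t = (T_ts @ cam_h)[:3]
    # Project into the target image; normalise to [-1, 1] for grid_sample.
    proj = K @ cam_t
    uv = proj[:2] / proj[2:].clamp(min=1e-6)
    grid = torch.stack([2 * uv[0] / (W - 1) - 1,
                        2 * uv[1] / (H - 1) - 1], dim=-1).reshape(1, H, W, 2)
    warped = F.grid_sample(img_tgt[None], grid, mode="bilinear",
                           padding_mode="border", align_corners=True)[0]
    # L1 photometric consistency between the source image and the warp.
    return (img_src - warped).abs().mean()
```

In training, such a term would simply be added to the standard NeRF rendering loss with a weight.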

Notes

  1. Notice that the multi-modality of ray terminations is important for modelling non-opaque surfaces (illustrated in the first sketch following these notes).

  2. This enables [4] to estimate large-baseline motions. [4] advocate annealing the alignment loss over the course of training to avoid baking in errors from single-view depth predictions; this can be seen as an alternative pose-initialization scheme, via a depth-guided image alignment loss, instead of using COLMAP.

  3. Note that as \(\mathbf {u_s}\) is the projection of the point of intersection of the two rays, by definition it lies on the ray corresponding to pixel \(\mathbf {u_s}\) (see the ray-intersection sketch following these notes).

  4. We see a small boost in performance by linearly decreasing the weight of the alignment loss to zero, and the results in Table 2 use this scheduler (see the scheduler sketch following these notes). Ablations for the scheduling can be found in the supplementary material. Please note that all other reported results and visualizations (except Table 2) use no scheduling.
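Note 1's point can be made concrete with the standard ray-termination weights from the NeRF volume-rendering quadrature [18]; this formula is from the NeRF literature, not specific to this paper. For a translucent surface, the weights place mass at several depths along the ray rather than at a single surface crossing.

```python
import numpy as np

def termination_weights(sigma, delta):
    """Per-sample ray-termination weights from the standard NeRF quadrature.

    sigma: (N,) densities along the ray; delta: (N,) sample spacings.
    Returns w with w_i = T_i * (1 - exp(-sigma_i * delta_i)), where
    T_i = exp(-sum_{j<i} sigma_j * delta_j) is the accumulated transmittance.
    For non-opaque surfaces this distribution over depth is multi-modal.
    """
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.exp(-np.cumsum(np.concatenate([[0.0], sigma[:-1] * delta[:-1]])))
    return trans * alpha
```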
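For reference, the two-ray "intersection" mentioned in Note 3 is commonly computed as the midpoint of the closest points between the rays, since two rays rarely cross exactly. Below is a minimal NumPy sketch of that standard construction under our own naming assumptions (`o1`, `d1`, `o2`, `d2`, `K`); it is not code from the paper.

```python
import numpy as np

def ray_pseudo_intersection(o1, d1, o2, d2):
    """Midpoint of the closest points on rays o1 + t1*d1 and o2 + t2*d2.

    Assumes the rays are not parallel (A would be singular otherwise).
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = o2 - o1
    # Normal equations for min_{t1,t2} ||(o1 + t1*d1) - (o2 + t2*d2)||^2.
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    t1, t2 = np.linalg.solve(A, np.array([b @ d1, b @ d2]))
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

def project(K, X):
    """Pinhole projection of a camera-frame 3D point X to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]
```

Projecting this pseudo-intersection back into the source camera recovers, up to numerical error, the pixel \(\mathbf {u_s}\) the ray was cast from, which is the property Note 3 relies on.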
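The scheduler in Note 4 is a plain linear decay. A minimal sketch, with a hypothetical initial weight and step budget:

```python
def alignment_weight(step, total_steps, w0=1.0):
    """Linearly decay the alignment-loss weight from w0 to 0 over training."""
    return w0 * max(0.0, 1.0 - step / total_steps)

# Hypothetical usage inside a training loop:
#   loss = loss_nerf + alignment_weight(step, total_steps) * loss_align
```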

References

  1. Agarwal, S., Furukawa, Y., Snavely, N., Simon, I., Curless, B., Seitz, S.M., Szeliski, R.: Building Rome in a day. Commun. ACM 54, 105–112 (2011)

  2. Baker, S., Matthews, I.: Lucas-Kanade 20 years on: a unifying framework part 1: the quantity approximated, the warp update rule, and the gradient descent approximation. International Journal of Computer Vision (2004)

  3. Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: unbounded anti-aliased neural radiance fields. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5460–5469 (2022). https://doi.org/10.1109/CVPR52688.2022.00539

  4. Bian, W., Wang, Z., Li, K., Bian, J.W., Prisacariu, V.A.: NoPe-NeRF: optimising neural radiance field with no pose prior. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4160–4169 (2023)

  5. Cai, H., Feng, W., Feng, X., Wang, Y., Zhang, J.: Neural surface reconstruction of dynamic scenes with monocular RGB-D camera. Adv. Neural Inf. Process. Syst. 35, 967–981 (2022)

  6. Chen, A., Xu, Z., Zhao, F., Zhang, X., Xiang, F., Yu, J., Su, H.: MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 14124–14133 (2021)

  7. Chng, S.F., Garg, R., Saratchandran, H., Lucey, S.: Invertible neural warp for NeRF. arXiv preprint arXiv:2407.12354 (2024)

  8. Chng, S.F., Ramasinghe, S., Sherrah, J., Lucey, S.: Gaussian activated neural radiance fields for high fidelity reconstruction and pose estimation. In: European Conference on Computer Vision, pp. 264–280. Springer (2022)

  9. Deng, K., Liu, A., Zhu, J.Y., Ramanan, D.: Depth-supervised NeRF: fewer views and faster training for free. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12882–12891 (2022)

  10. Gao, Z., Dai, W., Zhang, Y.: Adaptive positional encoding for bundle-adjusting neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3284–3294 (2023)

  11. Jain, A., Tancik, M., Abbeel, P.: Putting NeRF on a diet: semantically consistent few-shot view synthesis (2021)

  12. Jensen, R., Dahl, A., Vogiatzis, G., Tola, E., Aanæs, H.: Large scale multi-view stereopsis evaluation. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 406–413. IEEE (2014)

  13. Jeong, Y., Ahn, S., Choy, C., Anandkumar, A., Cho, M., Park, J.: Self-calibrating neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5846–5854 (2021)

  14. Kim, M., Seo, S., Han, B.: InfoNeRF: ray entropy minimization for few-shot neural volume rendering. In: CVPR (2022)

  15. Li, Z., Niklaus, S., Snavely, N., Wang, O.: Neural scene flow fields for space-time view synthesis of dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  16. Lin, C.H., Ma, W.C., Torralba, A., Lucey, S.: BARF: bundle-adjusting neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5741–5751 (2021)

  17. Mildenhall, B., Srinivasan, P.P., Ortiz-Cayon, R., Kalantari, N.K., Ramamoorthi, R., Ng, R., Kar, A.: Local light field fusion: practical view synthesis with prescriptive sampling guidelines. ACM Trans. Graph. 38(4), 1–14 (2019)

  18. Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. Commun. ACM 65(1), 99–106 (2021)

  19. Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Rob. 31(5), 1147–1163 (2015). https://doi.org/10.1109/TRO.2015.2463671

  20. Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: dense tracking and mapping in real-time. In: 2011 International Conference on Computer Vision, pp. 2320–2327. IEEE (2011)

  21. Niemeyer, M., Barron, J.T., Mildenhall, B., Sajjadi, M.S., Geiger, A., Radwan, N.: RegNeRF: regularizing neural radiance fields for view synthesis from sparse inputs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5480–5490 (2022)

  22. Park, K., Henzler, P., Mildenhall, B., Barron, J.T., Martin-Brualla, R.: CamP: camera preconditioning for neural radiance fields. ACM Trans. Graph. 42(6), 1–11 (2023)

  23. Park, K., Sinha, U., Barron, J.T., Bouaziz, S., Goldman, D.B., Seitz, S.M., Martin-Brualla, R.: Nerfies: deformable neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5865–5874 (2021)

  24. Roessle, B., Barron, J.T., Mildenhall, B., Srinivasan, P.P., Nießner, M.: Dense depth priors for neural radiance fields from sparse input views. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

  25. Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  26. Schönberger, J.L., Zheng, E., Pollefeys, M., Frahm, J.M.: Pixelwise view selection for unstructured multi-view stereo. In: European Conference on Computer Vision (ECCV) (2016)

  27. Truong, P., Rakotosaona, M.J., Manhardt, F., Tombari, F.: SPARF: neural radiance fields from sparse and noisy poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4190–4200 (2023)

  28. Wang, C., MacDonald, L.E., Jeni, L.A., Lucey, S.: Flow supervision for deformable NeRF. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 21128–21137 (2023)

  29. Wang, Q., Chang, Y.Y., Cai, R., Li, Z., Hariharan, B., Holynski, A., Snavely, N.: Tracking everything everywhere all at once. arXiv preprint arXiv:2306.05422 (2023)

  30. Wang, Z., Wu, S., Xie, W., Chen, M., Prisacariu, V.A.: NeRF–: neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064 (2021)

  31. Xia, Y., Tang, H., Timofte, R., Van Gool, L.: SiNeRF: sinusoidal neural radiance fields for joint pose estimation and scene reconstruction. arXiv preprint arXiv:2210.04553 (2022)

  32. Yu, A., Ye, V., Tancik, M., Kanazawa, A.: pixelNeRF: neural radiance fields from one or few images. In: CVPR (2021)

  33. Yu, Z., Peng, S., Niemeyer, M., Sattler, T., Geiger, A.: MonoSDF: exploring monocular geometric cues for neural implicit surface reconstruction. In: Advances in Neural Information Processing Systems (NeurIPS) (2022)

  34. Zhan, H., Zheng, J., Xu, Y., Reid, I., Rezatofighi, H.: ActiveRMAP: radiance field for active mapping and planning. arXiv preprint arXiv:2211.12656 (2022)

  35. Zhao, F., Yang, W., Zhang, J., Lin, P., Zhang, Y., Yu, J., Xu, L.: HumanNeRF: efficiently generated human radiance field from sparse inputs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7743–7753 (2022)

Author information

Correspondence to Ravi Garg.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 259 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Garg, R., Chng, S.F., Lucey, S. (2025). Direct Alignment for Robust NeRF Learning. In: Cho, M., Laptev, I., Tran, D., Yao, A., Zha, H. (eds) Computer Vision – ACCV 2024. ACCV 2024. Lecture Notes in Computer Science, vol 15480. Springer, Singapore. https://doi.org/10.1007/978-981-96-0969-7_6

  • DOI: https://doi.org/10.1007/978-981-96-0969-7_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-0968-0

  • Online ISBN: 978-981-96-0969-7

  • eBook Packages: Computer Science, Computer Science (R0)
