Abstract
Recent generalizable NeRF methods synthesize novel-view images without per-scene optimization by constructing radiance fields from 2D image features. However, most existing methods render slowly because they must query the NeRF model at millions of 3D points. In this paper, we propose a photorealistic novel view synthesis method with generalizable and efficient rendering. Specifically, given a set of multi-view images, we use a multi-scale scene geometry predictor that combines multi-view stereo (MVS) and NeRF to infer key points from coarse to fine. In addition, to obtain more accurate key-point positions and features, we design an uncertainty-guided sampling strategy based on depth prediction and the uncertainty of that prediction. With the key points and scene geometry features, we propose a rendering network that synthesizes full-resolution images. This process is fully differentiable, allowing us to train the network with only RGB images. Experiments on various synthetic and real datasets show that our model is more efficient than state-of-the-art baselines while achieving higher rendering quality. With the multi-scale scene geometry predictor and the uncertainty-aware sampling strategy, our approach infers geometry efficiently and significantly improves rendering speed.
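The uncertainty-guided sampling can be pictured as concentrating each ray's samples in a depth window whose width tracks the predicted uncertainty, so fewer queries land in empty space. Below is a minimal PyTorch sketch under that reading; the function name, the +/- k*sigma window, and the stratified jitter are our assumptions for illustration, not the authors' implementation.

import torch

def uncertainty_guided_samples(depth, sigma, num_samples=8, k=3.0):
    # depth: (N,) per-ray depth predicted by the MVS stage (assumption).
    # sigma: (N,) per-ray depth uncertainty, e.g. a predicted std. dev. (assumption).
    # Returns (N, num_samples) monotonically increasing sample depths per ray.
    near = depth - k * sigma  # window shrinks as depth confidence grows
    far = depth + k * sigma
    t = torch.linspace(0.0, 1.0, num_samples, device=depth.device)
    # Stratified jitter within each bin; jitter < bin spacing keeps samples ordered.
    t = t + torch.rand(depth.shape[0], num_samples, device=depth.device) / num_samples
    return near[:, None] + (far - near)[:, None] * t.clamp(0.0, 1.0)

Because the window collapses where the depth prediction is confident, a handful of samples per ray can replace the hundreds a uniform near-to-far NeRF sampler would need, which is consistent with the speed-up the abstract claims.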
Acknowledgements
This work was supported by the Natural Science Foundation of Guangdong Province, China (No. 2022A1515010148).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mo, Z., Wu, W., Yu, W., Zhang, T., Ke, Z., Huang, J. (2023). Fast Generalizable Novel View Synthesis with Uncertainty-Aware Sampling. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14256. Springer, Cham. https://doi.org/10.1007/978-3-031-44213-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44212-4
Online ISBN: 978-3-031-44213-1