Abstract
Novel view synthesis has achieved remarkable quality and efficiency under the paradigm of 3D Gaussian Splatting (3D-GS), but two challenges remain: 1) significant performance degradation when trained with only few-shot samples, due to the lack of geometry constraints, and 2) the inability to render at resolutions beyond that of the training samples. In this paper, we propose Dual-Lens 3D-GS (DL-GS) to achieve high-resolution (HR) and few-shot view synthesis by leveraging the characteristics of the asymmetric dual-lens system commonly equipped on mobile devices. This kind of system captures the same scene with different focal lengths (i.e., wide-angle and telephoto) under an asymmetric stereo configuration, which naturally provides geometric hints for few-shot training and HR guidance for resolution improvement. Nevertheless, two major technical problems remain in achieving this goal. First, how can the geometry information in the asymmetric stereo configuration be effectively exploited? To this end, we propose a consistency-aware training strategy, which integrates a dual-lens-consistent loss to regularize the 3D-GS optimization. Second, how can the dual-lens training samples be best used to improve the resolution of newly synthesized views? To this end, we design a multi-reference-guided refinement module that selects suitable telephoto and wide-angle guidance images from the training samples based on camera pose distances, and then exploits their information for high-frequency detail enhancement. Extensive experiments on simulated and real-captured datasets validate the distinct superiority of our DL-GS over various competitors on the task of HR and few-shot view synthesis. The implementation code is available at https://github.com/XrKang/DL-GS.
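As a rough illustration of the reference-selection step mentioned in the abstract (choosing telephoto and wide-angle guidance images "based on the camera pose distances"), the sketch below picks the nearest telephoto and wide-angle training views for a novel camera using a simple pose-distance metric: the gap between camera centers plus a weighted relative rotation angle. This metric and all names (pose_distance, select_references, rot_weight) are assumptions made for illustration only, not the authors' implementation; the released code at the GitHub link above contains the actual selection criterion.

# Hypothetical sketch of reference-view selection by camera pose distance.
# The distance metric and function names are illustrative assumptions.
import numpy as np

def pose_distance(pose_a: np.ndarray, pose_b: np.ndarray, rot_weight: float = 1.0) -> float:
    """Distance between two 4x4 camera-to-world poses: Euclidean gap between
    camera centers plus a weighted relative rotation angle (in radians)."""
    t_dist = np.linalg.norm(pose_a[:3, 3] - pose_b[:3, 3])
    # Relative rotation angle from the trace of R_a^T R_b.
    cos_angle = (np.trace(pose_a[:3, :3].T @ pose_b[:3, :3]) - 1.0) / 2.0
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return t_dist + rot_weight * angle

def select_references(novel_pose, tele_poses, wide_poses, k=2):
    """Return indices of the k nearest telephoto and k nearest wide-angle views."""
    tele_idx = np.argsort([pose_distance(novel_pose, p) for p in tele_poses])[:k]
    wide_idx = np.argsort([pose_distance(novel_pose, p) for p in wide_poses])[:k]
    return tele_idx.tolist(), wide_idx.tolist()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    def random_pose():
        # Random rotation via QR decomposition (sign-corrected to det=+1) plus a random translation.
        q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        pose = np.eye(4)
        pose[:3, :3] = q * np.sign(np.linalg.det(q))
        pose[:3, 3] = rng.normal(size=3)
        return pose
    tele_poses = [random_pose() for _ in range(6)]
    wide_poses = [random_pose() for _ in range(6)]
    print(select_references(random_pose(), tele_poses, wide_poses, k=2))

In a pipeline of this kind, the selected telephoto and wide-angle views would then serve as guidance images for the refinement stage; how their high-frequency details are fused is specific to the paper's refinement module and is not reproduced here.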
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grants 62131003 and 62021001.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, R., Yao, M., Li, Y., Zhang, Y., Xiong, Z. (2025). High-Resolution and Few-Shot View Synthesis from Asymmetric Dual-Lens Inputs. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15061. Springer, Cham. https://doi.org/10.1007/978-3-031-72646-0_13
DOI: https://doi.org/10.1007/978-3-031-72646-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72645-3
Online ISBN: 978-3-031-72646-0
eBook Packages: Computer Science, Computer Science (R0)