Abstract
Pre-training and fine-tuning are important steps in developing deep-learning models for many modalities, e.g., images, video, and text. Recently proposed works have made Neural Radiance Fields (NeRF) ready for these two tasks. Alongside training and tuning, techniques for enriching data, such as data augmentation and masking, are also crucial to the success of deep learning. Inspired by computer vision (e.g., classification, detection, and segmentation), where data augmentation is performed online for each input batch, we propose to use computer-graphics libraries (in particular, OpenGL) to generate views online from virtually posed cameras for NeRF. OpenGL does provide a camera frame; however, its camera is configured differently from the one used in computer vision: specifically, OpenGL’s camera does not use an intrinsic camera matrix. This paper presents an approach to calibrate cameras between the two fields, i.e., computer vision and OpenGL. The developed technique enables online view generation that integrates naturally with NeRF research. By utilizing fundamental features of graphics pipelines, e.g., geometric transformation, lighting, shading, and texturing, the proposed technique can support operations analogous to data augmentation for images, video, and text. Experiments in this paper show that: (a) the intrinsic camera matrix can be loaded into the projection matrix in OpenGL; (b) the intrinsic camera matrix used by the proposed technique can be successfully recovered by COLMAP [1], which means that our calibration method works; and (c) the generated views can be used with nerfstudio [2] to build 3D models, which means that the proposed method generates views compatible with NeRF algorithms.
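The mapping in experiment (a) — loading a computer-vision intrinsic matrix into an OpenGL projection matrix — can be sketched as follows. This is a minimal illustration using the commonly used conventions (the CV camera looks down +z with the image y axis pointing down; the OpenGL camera looks down −z with y up); the exact signs depend on viewport conventions, and the function name is illustrative, not taken from the paper.

```python
import numpy as np

def intrinsics_to_gl_projection(fx, fy, cx, cy, width, height, near, far):
    """Build an OpenGL-style 4x4 projection matrix from pinhole intrinsics.

    Assumes the usual conventions: the CV camera looks down +z with the
    image y axis pointing down; the OpenGL camera looks down -z with y up.
    """
    return np.array([
        [2.0 * fx / width, 0.0,                (width - 2.0 * cx) / width,   0.0],
        [0.0,              2.0 * fy / height,  (2.0 * cy - height) / height, 0.0],
        [0.0,              0.0,                -(far + near) / (far - near), -2.0 * far * near / (far - near)],
        [0.0,              0.0,                -1.0,                          0.0],
    ])

# Sanity check: a point pushed through the GL pipeline should land on the
# same pixel as the pinhole model u = fx*X/Z + cx, v = fy*Y/Z + cy.
fx, fy, cx, cy, w, h = 800.0, 820.0, 320.5, 240.5, 640, 480
P = intrinsics_to_gl_projection(fx, fy, cx, cy, w, h, near=0.1, far=100.0)

X, Y, Z = 0.2, -0.1, 2.0               # point in CV camera coordinates
clip = P @ np.array([X, -Y, -Z, 1.0])  # flip y and z into GL coordinates
ndc = clip[:3] / clip[3]               # perspective divide
u = (ndc[0] + 1.0) * w / 2.0           # viewport transform
v = h - (ndc[1] + 1.0) * h / 2.0       # measure v from the top image row

assert abs(u - (fx * X / Z + cx)) < 1e-6
assert abs(v - (fy * Y / Z + cy)) < 1e-6
```

The resulting matrix can be passed to OpenGL in place of a `gluPerspective`/`glFrustum`-style projection, so that rendered views reproduce the pinhole geometry described by the intrinsic matrix.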
References
Schonberger, J.L., Frahm, J.-M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4104–4113 (2016)
Tancik, M., et al.: Nerfstudio: a modular framework for neural radiance field development. In: Proceedings of ACM SIGGRAPH Conference (2023)
Van Nguyen, S., Tran, H.M., Maleszka, M.: Geometric modeling: background for processing the 3D objects. Appl. Intell. 51(8), 6182–6201 (2021)
Nguyen, V.-S., Bac, A., Daniel, M.: Simplification of 3D point clouds sampled from elevation surfaces. In: 21st International Conference on Computer Graphics, Visualization and Computer Vision, WSCG 2013, Plzen, Czech Republic, pp. 60–69 (2013). ISBN 978-80-86943-75-6
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Proceedings of ECCV (2020)
Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(60) (2019)
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019)
Xie, Z., et al.: SimMIM: a simple framework for masked image modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9653–9663 (2022)
Barron, J.T., Mildenhall, B., Tancik, M., Hedman, P., Martin-Brualla, R., Srinivasan, P.P.: Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5855–5864 (2021)
Verbin, D., Hedman, P., Mildenhall, B., Zickler, T., Barron, J.T., Srinivasan, P.P.: Ref-NeRF: structured view-dependent appearance for neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5491–5500 (2022)
Xu, Q., et al.: Point-NeRF: point-based neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5438–5448 (2022)
Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: unbounded anti-aliased neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5470–5479 (2022)
Yu, A., Li, R., Tancik, M., Li, H., Ng, R., Kanazawa, A.: Plenoctrees for real-time rendering of neural radiance fields. In: Proceedings of ICCV (2021)
Garbin, S.J., Kowalski, M., Johnson, M., Shotton, J., Valentin, J.: FastNeRF: high-fidelity neural rendering at 200 FPS. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 14346–14355 (2021)
Chen, Z., Funkhouser, T., Hedman, P., Tagliasacchi, A.: MobileNeRF: exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023)
Wang, Q., et al.: IBRNet: learning multi-view image-based rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4690–4699 (2021)
Yu, A., Ye, V., Tancik, M., Kanazawa, A.: pixelNeRF: neural radiance fields from one or few images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4578–4587 (2021)
Wang, P., Chen, X., Chen, T., Venugopalan, S., Wang, Z.: Is attention all that NeRF needs? In: The Eleventh International Conference on Learning Representations (2023)
Cong, W., et al.: Enhancing NeRF akin to enhancing LLMs: generalizable NeRF transformer with mixture-of-view-experts. In: Proceedings of ICCV (2023)
Suhail, M., Esteves, C., Sigal, L., Makadia, A.: Generalizable patch-based neural rendering. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13692, pp. 156–174. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19824-3_10
Yang, H., et al.: ContraNeRF: generalizable neural radiance fields for synthetic-to-real novel view synthesis via contrastive learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16508–16517 (2023)
Chen, J., Yi, W., Ma, L., Jia, X., Lu, H.: GM-NeRF: learning generalizable model-based neural radiance fields from multi-view images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20648–20658 (2023)
Sinh, V.N., et al.: A solution for building a V-museum based on virtual reality application. In: Nguyen, N.T., et al. (eds.) ICCCI 2023. CCIS, vol. 1864, pp. 597–609. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-41774-0_47
Acknowledgments
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number DS2023-28-01. We acknowledge Ho Chi Minh City University of Technology (HCMUT), VNU-HCM for supporting this study, and the Data Science Laboratory (DsciLab), Ho Chi Minh City University of Technology (HCMUT) for providing the machines used in our experiments.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Le, S.T., Nguyen, S.V., Tran, M.K., Nguyen, L.D.V. (2024). Graphics and Vision’s Camera Calibration and Applications to Neural Radiance Fields. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2024. Communications in Computer and Information Science, vol 2145. Springer, Singapore. https://doi.org/10.1007/978-981-97-5934-7_11
DOI: https://doi.org/10.1007/978-981-97-5934-7_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5933-0
Online ISBN: 978-981-97-5934-7