Abstract
This paper addresses camera localization, i.e., 6-DoF pose estimation with respect to a given 3D reconstruction. Current methods often use a coarse-to-fine image registration framework that integrates image retrieval and visual keypoint matching. However, localization accuracy is restricted by the limited invariance of feature descriptors. For example, when the query image is acquired under illumination (day/night) that differs from that of the model images, or from a position not covered by the model images, retrieval and feature matching may fail, leading to incorrect pose estimates. In this paper, we propose to increase the diversity of the model images, in terms of both viewpoint and visual appearance, by synthesizing novel images with neural rendering methods. Specifically, we build the 3D model on Neural Radiance Fields (NeRF) and use appearance embeddings to encode illumination variation. We then propose an efficient strategy that interpolates appearance embeddings and places virtual cameras in the scene to generate virtual model images. To facilitate model image management, the appearance embeddings are associated with image acquisition conditions such as time of day, season, and weather. The query image pose is estimated from virtual views rendered under similar conditions, using the conventional hierarchical localization framework. We demonstrate the approach by localizing single smartphone images in a large-scale 3D urban model, showing improved pose estimation accuracy.
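As a rough illustration of the idea of interpolating appearance embeddings and placing virtual cameras, the following minimal sketch blends two vectors linearly. The abstract does not specify the interpolation scheme or the camera placement rule, so linear blending, the function name `lerp_path`, and the toy values are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def lerp_path(start, end, num_steps):
    """Return num_steps vectors blended linearly from start to end.

    Hypothetical helper: linear blending is an assumed scheme, used here
    for both appearance embeddings and virtual camera centers.
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    # Blend weights run from 0 (pure start) to 1 (pure end).
    weights = np.linspace(0.0, 1.0, num_steps)
    return np.stack([(1.0 - w) * start + w * end for w in weights])


# Toy appearance embeddings standing in for "day" and "night" conditions.
day_embedding = [1.0, 0.0]
night_embedding = [0.0, 1.0]
virtual_embeddings = lerp_path(day_embedding, night_embedding, 3)

# Toy camera centers of two model images; virtual cameras are placed between them.
camera_a = [0.0, 0.0, 0.0]
camera_b = [2.0, 0.0, 0.0]
virtual_centers = lerp_path(camera_a, camera_b, 3)
```

Each pair of a virtual embedding and a virtual camera center would then be passed to the trained renderer to produce one virtual model image; that rendering step is outside the scope of this sketch.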
This work was completed when Zhenbo Song was an intern at ByteDance. This work was supported in part by the Jiangsu Funding Program for Excellent Postdoctoral Talent under Grant 2022ZB268.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Song, Z., Sun, X., Xue, Z., Xie, D., Wen, C. (2022). Visual Localization Through Virtual Views. In: Fang, L., Povey, D., Zhai, G., Mei, T., Wang, R. (eds) Artificial Intelligence. CICAI 2022. Lecture Notes in Computer Science(), vol 13606. Springer, Cham. https://doi.org/10.1007/978-3-031-20503-3_52
DOI: https://doi.org/10.1007/978-3-031-20503-3_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20502-6
Online ISBN: 978-3-031-20503-3
eBook Packages: Computer Science, Computer Science (R0)