DFNet: Enhance Absolute Pose Regression with Direct Feature Matching

Chen, Shuai; Li, Xinghui; Wang, Zirui; Prisacariu, Victor A.

doi:10.1007/978-3-031-20080-9_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13670)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We introduce a camera relocalization pipeline that combines absolute pose regression (APR) and direct feature matching. By incorporating exposure-adaptive novel view synthesis, our method addresses photometric distortions in outdoor environments that existing photometric-based methods fail to handle. With domain-invariant feature matching, our solution improves pose regression accuracy using semi-supervised learning on unlabeled data. The pipeline consists of two components: a Novel View Synthesizer and DFNet. The former synthesizes novel views while compensating for changes in exposure; the latter regresses camera poses and extracts robust features that close the domain gap between real and synthetic images. Furthermore, we introduce an online synthetic data generation scheme. We show that these approaches effectively enhance camera pose estimation in both indoor and outdoor scenes. As a result, our method achieves state-of-the-art accuracy, outperforming existing single-image APR methods by as much as 56% and reaching accuracy comparable to 3D structure-based methods. (The code is available at https://code.active.vision.)
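
The abstract does not state the form of the feature-matching objective. As a rough illustration only, the sketch below compares descriptors extracted from a real image with descriptors extracted from a view synthesized at the predicted pose, and averages their cosine distance. The function names, the list-of-vectors feature layout, and the choice of cosine distance are all assumptions made for this example, not details taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat an all-zero descriptor as carrying no similarity signal.
        return 0.0
    return dot / (norm_a * norm_b)


def feature_matching_loss(real_feats, synth_feats):
    """Mean cosine distance between paired descriptors.

    real_feats:  descriptors from the real query image (hypothetical layout:
                 one vector per pixel or patch).
    synth_feats: descriptors from the view rendered at the predicted pose,
                 in the same order as real_feats.
    """
    if not real_feats or len(real_feats) != len(synth_feats):
        raise ValueError("feature lists must be non-empty and the same length")
    distances = [
        1.0 - cosine_similarity(r, s) for r, s in zip(real_feats, synth_feats)
    ]
    return sum(distances) / len(distances)
```

Under these assumptions the loss is zero when the two feature sets point in the same directions, regardless of scale, and grows as they diverge. In a semi-supervised setting of the kind the abstract describes, such a loss would need no ground-truth pose label, since it depends only on the query image and the rendered view.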

Acknowledgment

The authors thank Michael Hobley, Theo Costain, Lixiong Chen, and Kejie Li for their thoughtful comments. Shuai Chen was supported by gift funding from Huawei.

Author information

Authors and Affiliations

Active Vision Lab, University of Oxford, Oxford, England
Shuai Chen, Xinghui Li, Zirui Wang & Victor A. Prisacariu

Corresponding author

Correspondence to Shuai Chen.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

About this paper

Cite this paper

Chen, S., Li, X., Wang, Z., Prisacariu, V.A. (2022). DFNet: Enhance Absolute Pose Regression with Direct Feature Matching. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13670. Springer, Cham. https://doi.org/10.1007/978-3-031-20080-9_1

DOI: https://doi.org/10.1007/978-3-031-20080-9_1
Published: 03 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20079-3
Online ISBN: 978-3-031-20080-9
eBook Packages: Computer Science, Computer Science (R0)
