Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network

Ye, Junyan; Lv, Zhutao; Li, Weijia; Yu, Jinhua; Yang, Haote; Zhong, Huaping; He, Conghui

doi:10.1007/978-3-031-72913-3_5


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15095)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Cross-view geo-localization identifies the geographic location of street view images by matching them against a georeferenced satellite database. The task is challenging because appearance and geometry differ drastically between the two views. In this paper, we propose a new approach for cross-view image geo-localization, the Panorama-BEV Co-Retrieval Network. Specifically, using the ground plane assumption and geometric relations, we convert street view panorama images into a bird's-eye view (BEV), reducing the gap between street panoramas and satellite imagery. Alongside the existing retrieval between street view panoramas and satellite images, we introduce a BEV-to-satellite retrieval branch, so that the two branches retrieve collaboratively. Retaining the original street view retrieval branch overcomes the limited perception range of the BEV representation. Our network thus perceives both the global layout and the local details around the street view capture location. Additionally, we introduce CVGlobal, a global cross-view dataset that is closer to real-world scenarios. It adopts a more realistic setup in which street view directions are not aligned with the satellite images. CVGlobal also includes cross-regional, cross-temporal, and street-view-to-map retrieval tests, enabling a comprehensive evaluation of algorithm performance. Our method excels in multiple tests on common cross-view datasets such as CVUSA, CVACT, and VIGOR, as well as on our newly introduced CVGlobal, surpassing the current state-of-the-art approaches. The code and datasets can be found at https://github.com/yejy53/EP-BEV.
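
To make the ground plane assumption concrete, the following is a minimal sketch of how a BEV pixel could be traced back to a panorama pixel. It is not the authors' implementation: the function names, the equirectangular panorama layout, the camera height of 2 m, the 20 m ground range, and the convention that the panorama's centre column faces north are all assumptions chosen for illustration.

```python
import math


def bev_to_pano_pixel(row, col, bev_size, ground_range_m, cam_height_m, pano_w, pano_h):
    """Map a BEV pixel to (u, v) in an equirectangular panorama, assuming flat ground."""
    metres_per_px = 2.0 * ground_range_m / bev_size
    # Offsets of the pixel centre from the camera on the ground plane (x: east, y: north).
    x = (col + 0.5 - bev_size / 2.0) * metres_per_px
    y = (bev_size / 2.0 - (row + 0.5)) * metres_per_px
    dist = math.hypot(x, y)
    azimuth = math.atan2(x, y)                    # 0 = north, clockwise positive
    depression = math.atan2(cam_height_m, dist)   # angle below the horizon
    # Assumed convention: centre column looks north, horizon sits at mid-height.
    u = ((azimuth / (2.0 * math.pi) + 0.5) % 1.0) * pano_w
    v = (0.5 + depression / math.pi) * pano_h
    return u, v


def panorama_to_bev(pano, bev_size, ground_range_m=20.0, cam_height_m=2.0):
    """Build a bev_size x bev_size top-down image by nearest-neighbour sampling.

    `pano` is a list of rows (any pixel type); values are illustrative defaults.
    """
    pano_h, pano_w = len(pano), len(pano[0])
    bev = []
    for row in range(bev_size):
        out_row = []
        for col in range(bev_size):
            u, v = bev_to_pano_pixel(row, col, bev_size, ground_range_m,
                                     cam_height_m, pano_w, pano_h)
            # Clamp the row and wrap the column so indices stay in bounds.
            out_row.append(pano[min(int(v), pano_h - 1)][int(u) % pano_w])
        bev.append(out_row)
    return bev
```

Because every ground point lies below the horizon, the sketch only ever samples the lower half of the panorama, and distant ground points crowd toward the horizon row. This illustrates the limited perception range of the BEV representation that the abstract mentions as the reason for retaining the panorama branch.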

This work was partially done during the internship at Shanghai AI Lab.

Acknowledgements

This project was funded in part by the National Natural Science Foundation of China (Grant No. 42201358) and Shanghai AI Lab.

Author information

Authors and Affiliations

Sun Yat-Sen University, Guangzhou, China
Junyan Ye, Zhutao Lv, Weijia Li & Jinhua Yu
Shanghai AI Laboratory, Shanghai, China
Junyan Ye, Haote Yang & Conghui He
SenseTime Research, Shanghai, China
Huaping Zhong & Conghui He

Corresponding authors

Correspondence to Weijia Li or Conghui He.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 15299 KB)

About this paper

Cite this paper

Ye, J. et al. (2025). Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15095. Springer, Cham. https://doi.org/10.1007/978-3-031-72913-3_5

DOI: https://doi.org/10.1007/978-3-031-72913-3_5
Published: 02 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72912-6
Online ISBN: 978-3-031-72913-3
eBook Packages: Computer Science, Computer Science (R0)
