
Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15095)


Abstract

Cross-view geo-localization identifies the geographic location of street view images by matching them against a georeferenced satellite database. Significant challenges arise from the drastic differences in appearance and geometry between the two views. In this paper, we propose a new approach for cross-view image geo-localization, the Panorama-BEV Co-Retrieval Network. Specifically, using the ground plane assumption and geometric relations, we convert street view panoramas into a bird's-eye view (BEV), reducing the gap between street panoramas and satellite imagery. Alongside the existing retrieval between street view panoramas and satellite images, we introduce a retrieval branch between BEV and satellite images for collaborative retrieval. By retaining the original street view retrieval branch, we overcome the limited perception range of the BEV representation. Our network thus perceives both the global layout and the local details around the street view capture location. Additionally, we introduce CVGlobal, a global cross-view dataset that is closer to real-world scenarios. It adopts a more realistic setup in which street view directions are not aligned with the satellite images. CVGlobal also includes cross-regional, cross-temporal, and street-view-to-map retrieval tests, enabling a comprehensive evaluation of algorithm performance. Our method surpasses the current state-of-the-art approaches in multiple tests on common cross-view datasets such as CVUSA, CVACT, and VIGOR, as well as on our newly introduced CVGlobal. The code and datasets can be found at https://github.com/yejy53/EP-BEV.
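
The panorama-to-BEV conversion mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the EP-BEV repository: the function name `panorama_to_bev`, its parameters, and the default values are hypothetical. It assumes an equirectangular panorama whose centre column faces "ahead" and whose middle row is the horizon, a flat ground plane, a known camera height, and nearest-neighbour sampling. Each BEV pixel is treated as a point on the ground, and the panorama pixel that observes that point is looked up from its azimuth and elevation.

```python
import numpy as np


def panorama_to_bev(pano, bev_size=64, ground_range_m=20.0, camera_height_m=2.0):
    """Resample an equirectangular panorama onto a ground-plane BEV grid.

    Hypothetical helper for illustration only. Assumptions: flat ground,
    camera `camera_height_m` above it, panorama centre column = "ahead",
    panorama middle row = horizon.
    """
    H, W = pano.shape[:2]

    # Metric ground coordinates of every BEV pixel centre; the camera sits
    # at the centre of the BEV image and "ahead" points to the top.
    half = bev_size / 2.0
    scale = 2.0 * ground_range_m / bev_size  # metres per BEV pixel
    cols, rows = np.meshgrid(np.arange(bev_size), np.arange(bev_size))
    x = (cols + 0.5 - half) * scale  # metres to the right of the camera
    y = (half - rows - 0.5) * scale  # metres ahead of the camera

    # Viewing direction from the camera to each ground point.
    dist = np.hypot(x, y)
    azimuth = np.arctan2(x, y)                      # 0 = ahead, positive = right
    elevation = -np.arctan2(camera_height_m, dist)  # negative = below horizon

    # Equirectangular lookup: azimuth selects the column, elevation the row.
    pano_col = ((azimuth / (2.0 * np.pi) + 0.5) * W).astype(int) % W
    pano_row = np.clip(((0.5 - elevation / np.pi) * H).astype(int), 0, H - 1)
    return pano[pano_row, pano_col]


# Usage on a synthetic panorama (random pixels stand in for a real image).
pano = np.random.rand(256, 512, 3)
bev = panorama_to_bev(pano)
print(bev.shape)  # (64, 64, 3)
```

Because every BEV pixel is assumed to lie on the ground, only the lower half of the panorama is sampled, and objects above the ground plane such as buildings are distorted. This is consistent with the limited perception range of BEV that the abstract gives as the reason for retaining the panorama retrieval branch.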

This work was partially done during the internship at Shanghai AI Lab.

Notes

  1. https://svd360.istreetview.com/

  2. https://developers.google.com/maps/documentation/maps-static/

Acknowledgements

This project was funded in part by the National Natural Science Foundation of China (Grant No. 42201358) and Shanghai AI Lab.

Author information

Corresponding authors

Correspondence to Weijia Li or Conghui He.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 15299 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ye, J. et al. (2025). Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15095. Springer, Cham. https://doi.org/10.1007/978-3-031-72913-3_5

  • DOI: https://doi.org/10.1007/978-3-031-72913-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72912-6

  • Online ISBN: 978-3-031-72913-3

  • eBook Packages: Computer Science (R0)
