short-paper

Dual-branch Pattern and Multi-scale Context Facilitate Cross-view Geo-localization

Authors:
Bing Zhang, Dalian Minzu University, Dalian, China (ORCID: 0009-0004-6495-7749)
Jing Sun, Dalian Minzu University, Dalian, China (ORCID: 0000-0003-1389-1562)
Rui Yan, Dalian Minzu University, Dalian, China (ORCID: 0009-0008-7280-1612)
Fuming Sun, Dalian Minzu University, Dalian, China (ORCID: 0000-0003-3932-2712)
Fasheng Wang, Dalian Minzu University, Dalian, China (ORCID: 0000-0002-0946-0789)

UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective, November 2023, Pages 25–29. https://doi.org/10.1145/3607834.3616572

Published: 29 October 2023

ABSTRACT

Cross-view geo-localization aims to locate images of the same geographic target captured from different viewpoints, a challenging task in computer vision. Interference from similar images and from the environment surrounding the target building significantly reduces matching accuracy in complex scenes. To address this problem, we propose a cross-view geo-localization method based on a dual-branch pattern and multi-scale context, aimed at challenging datasets with numerous distractors. The method uses a Transformer feature extraction network to reduce the loss of fine-grained features. A dual-branch structure is designed to capture image semantic information and local context information bidirectionally, which effectively handles the larger number of distractors in satellite images and improves geo-localization accuracy in complex scenes. Quantitative experiments show that both recall (Recall) and image retrieval average precision (AP) improve significantly on the benchmark dataset University-1652 and the challenging dataset University-160K, and that our method achieves advanced cross-view geo-localization performance.
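
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Recall@K and AP are commonly computed for a single retrieval query. It is not the authors' evaluation code: the function names `recall_at_k` and `average_precision`, the integer location labels, and the toy ranking are all illustrative assumptions, and benchmark evaluation scripts may use a different AP interpolation.

```python
from typing import Sequence


def recall_at_k(ranked_labels: Sequence[int], query_label: int, k: int) -> float:
    """Return 1.0 if a true match appears among the top-k ranked items, else 0.0."""
    return float(query_label in ranked_labels[:k])


def average_precision(ranked_labels: Sequence[int], query_label: int) -> float:
    """Plain (non-interpolated) AP: mean of precision values at each true-match rank."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # A query with no true match in the gallery scores 0 by convention here.
    return precision_sum / hits if hits else 0.0


# Hypothetical gallery labels, already sorted by descending similarity to the query.
ranked = [7, 3, 7, 5, 7]

r1 = recall_at_k(ranked, query_label=7, k=1)
ap = average_precision(ranked, query_label=7)
```

In a full evaluation, these per-query values would be averaged over all queries to report Recall@K and mean AP.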

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published in
UAVM '23: Proceedings of the 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective
November 2023
86 pages
ISBN: 9798400702860
DOI: 10.1145/3607834
General Chairs:
Zhedong Zheng, National University of Singapore, Singapore
Yujiao Shi, The Australian National University, Australia
Tingyu Wang, Hangzhou Dianzi University, China
Jun Liu, Singapore University of Technology and Design, Singapore
Jianwu Fang, Chang'an University, China
Yunchao Wei, Beijing Jiaotong University, China
Tat-seng Chua, National University of Singapore, Singapore
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
drone
dual-branch pattern
geo-localization
transformer network