
Retrieval Across Optical and SAR Images with Deep Neural Network

  • Conference paper
  • First Online:
Advances in Multimedia Information Processing – PCM 2018 (PCM 2018)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11164)

Abstract

In this paper, we address cross-modal image retrieval between optical images and synthetic aperture radar (SAR) images. This retrieval is challenging due to the different imaging mechanisms and the large heterogeneity gap between the two modalities. We design a two-stream fully convolutional network to tackle this issue: it maps optical and SAR images into a common feature space where they can be compared directly. An image of either modality is fed into the corresponding branch to obtain comparable features. Each branch fuses two types of features in a weighted manner; both derive from the pooling features of VGG16 at different depths and are refined by a well-designed channels-aggregated convolution (CAC) operation and a semi-average pooling (SAP) operation. To obtain a better model, we propose an extensible training approach that trains the model from the local parts to the whole. In addition, we collect an optical/SAR image retrieval (OSR) dataset. Comprehensive experiments on this dataset demonstrate the effectiveness of the proposed method.
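The retrieval pipeline described above can be sketched at a high level: each branch fuses a shallow and a deep descriptor by a weighted sum, and retrieval ranks gallery images by similarity to the query in the shared space. The NumPy sketch below illustrates only this fusion-and-comparison step; random vectors stand in for the refined VGG16 pooling features, and the names `branch_embed`, `retrieve`, and the weight `alpha` are hypothetical. The actual CAC/SAP refinement and learned fusion weights from the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, eps=1e-8):
    # Normalize so that a dot product equals cosine similarity.
    return x / (np.linalg.norm(x) + eps)

def branch_embed(shallow_feat, deep_feat, alpha=0.5):
    # One branch of the two-stream network, reduced to its last step:
    # fuse two pooling-level features (shallow vs. deep) by a weighted
    # sum into a single descriptor in the common feature space.
    return l2_normalize(alpha * shallow_feat + (1 - alpha) * deep_feat)

def retrieve(query_desc, gallery_descs):
    # Rank gallery descriptors by cosine similarity to the query,
    # highest similarity first.
    sims = gallery_descs @ query_desc
    return np.argsort(-sims)

# Toy example: one optical query against three SAR gallery images.
# In the real system, the two branches would apply modality-specific
# convolutions before fusion; here both use the same stand-in features.
query = branch_embed(rng.normal(size=64), rng.normal(size=64))
gallery = np.stack([branch_embed(rng.normal(size=64), rng.normal(size=64))
                    for _ in range(3)])
ranking = retrieve(query, gallery)
```

Because both branches emit L2-normalized vectors of the same dimension, the dot product is a valid cross-modal similarity even though the inputs come from different sensors.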


Notes

  1. http://mi.eng.cam.ac.uk/~agk34/resources/SegNet/segnet_pascal.caffemodel.

  2. http://www.bigemap.com/.


Acknowledgement

This work was supported in part by 973 Program under Contract 2015CB351803, by Natural Science Foundation of China (NSFC) under Contract 61390514 and 61331017, and by the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Wengang Zhou.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 736 KB)


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Y., Zhou, W., Li, H. (2018). Retrieval Across Optical and SAR Images with Deep Neural Network. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science(), vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_36

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-00776-8_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00775-1

  • Online ISBN: 978-3-030-00776-8

  • eBook Packages: Computer Science, Computer Science (R0)
