WAFormer: Ship Detection in SAR Images Based on Window-Aware Swin-Transformer

  • Conference paper
  • Pattern Recognition and Computer Vision (PRCV 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13536)

Abstract

Research on deep-learning-based target detection in synthetic aperture radar (SAR) images has made great progress. However, most existing work applies methods designed for optical images directly to SAR images, ignoring the characteristics of SAR targets: they are usually small and variable in size, their distribution is relatively sparse, and detection is hampered by complex background noise. In this paper, we propose an improved backbone network, called WAFormer, for ship target detection in SAR images, built on the recent Swin-Transformer. WAFormer improves the local window attention mechanism of Swin-Transformer by introducing new window settings that better match the shapes of targets, yielding more accurate detection in SAR images. Experimental results show that WAFormer achieves 74.4% mAP on the Official-SSDD SAR dataset, surpassing Swin-Transformer by 1.0 points, with particularly strong gains on large targets.
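To make the "window-aware" idea concrete: Swin-Transformer partitions a feature map into square local windows before computing self-attention, and the change described in the abstract amounts to allowing non-square windows whose aspect ratio better matches elongated ship targets. The sketch below is a minimal PyTorch illustration of such a rectangular window partition; the `window_partition` helper and the 4×16 window size are assumptions for illustration, not the authors' released code.

```python
import torch

def window_partition(x, window_h, window_w):
    """Split a feature map into non-overlapping rectangular windows.

    x: (B, H, W, C) feature map; H must be divisible by window_h
       and W by window_w.
    Returns: (num_windows * B, window_h, window_w, C), where
    self-attention would then be computed independently per window.
    """
    B, H, W, C = x.shape
    # Reshape so window rows/columns become separate axes.
    x = x.view(B, H // window_h, window_h, W // window_w, window_w, C)
    # Group the two window-grid axes together, then flatten them.
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return windows.view(-1, window_h, window_w, C)

# Example: a 64x64 feature map split into wide 4x16 windows, which
# may fit elongated ships better than Swin's square 8x8 windows.
feat = torch.randn(1, 64, 64, 96)
wins = window_partition(feat, 4, 16)
print(wins.shape)  # torch.Size([64, 4, 16, 96])
```

With square windows this reduces to the standard Swin partition (`window_h == window_w`); the rectangular setting only changes how tokens are grouped before attention, so the attention computation itself is unchanged.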



Author information

Corresponding author: Lingfeng Wang


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, Z., Wang, L., Wang, W., Tian, S., Zhang, Z. (2022). WAFormer: Ship Detection in SAR Images Based on Window-Aware Swin-Transformer. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_41


  • DOI: https://doi.org/10.1007/978-3-031-18913-5_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18912-8

  • Online ISBN: 978-3-031-18913-5

  • eBook Packages: Computer Science, Computer Science (R0)
