Latent Diffusion Model-Based T2T-ViT for SAR Ship Classification

Qi, Yuhang; Wang, Lu; Li, Kaiyu; Liu, Haodong; Zhao, Chunhui

doi:10.1007/978-981-99-9640-7_22

Yuhang Qi¹, Lu Wang¹,², Kaiyu Li¹, Haodong Liu¹ & Chunhui Zhao¹,²

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2013)

Included in the following conference series:

CCF Conference on Computer Supported Cooperative Work and Social Computing

Abstract

Deep learning methods have recently been applied to ship classification in Synthetic Aperture Radar (SAR) images. However, because SAR ship datasets are imbalanced and contain too few samples, accurately identifying ships in SAR images remains challenging. In this paper, we propose an improved T2T-ViT model based on the latent diffusion model: the latent diffusion model expands the dataset through image generation, and an SE (squeeze-and-excitation) attention mechanism is added to adjust the channel weights. To evaluate the effectiveness of the proposed method, we trained and tested it on the OpenSARShip 2.0 dataset. Experimental results show that the proposed model achieves better recognition accuracy than existing models.
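
The abstract does not say how the SE attention mechanism is wired into T2T-ViT, so the following is only a minimal sketch of a generic squeeze-and-excitation step on a feature map shaped [channels][height][width], written with the Python standard library to keep it self-contained. The function name `se_reweight`, the weight matrices `w1` and `w2`, and the reduction ratio `r` are hypothetical and do not come from the paper; a real model would use a deep learning framework and learned weights.

```python
import math
import random


def se_reweight(feature_map, w1, w2):
    """Apply a squeeze-and-excitation step to a [C][H][W] feature map.

    w1 has shape [C // r][C] (reduction) and w2 has shape [C][C // r]
    (expansion). Both are hypothetical parameters; biases are omitted.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1
    ]
    # Excitation, step 2: expand back to C channels and apply a sigmoid,
    # so every channel receives a gate between 0 and 1.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    random.seed(0)
    C, H, W, r = 4, 2, 2, 2  # illustrative sizes, not taken from the paper
    x = [[[random.random() for _ in range(W)] for _ in range(H)]
         for _ in range(C)]
    w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
    w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
    y, gates = se_reweight(x, w1, w2)
    print("channel gates:", [round(g, 3) for g in gates])
```

Each gate rescales a whole channel, which is the sense in which the mechanism "adjusts the channel weight": channels with gates near 1 pass through almost unchanged, while channels with gates near 0 are suppressed.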

This work was supported in part by the National Natural Science Foundation of China under Grants 62271162 and 61971153, and by the Natural Science Foundation of Heilongjiang Province (YQ2022E016).

Author information

Authors and Affiliations

College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
Yuhang Qi, Lu Wang, Kaiyu Li, Haodong Liu & Chunhui Zhao
Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin, China
Lu Wang & Chunhui Zhao

Corresponding author

Correspondence to Lu Wang.

Editor information

Editors and Affiliations

Shandong University, Jinan, China
Yuqing Sun
Fudan University, Shanghai, China
Tun Lu
Harbin Engineering University, Harbin, China
Tong Wang
Tongji University, Shanghai, China
Hongfei Fan
Guangdong University of Technology, Guangzhou, China
Dongning Liu
Tongji University, Shanghai, China
Bowen Du

About this paper

Cite this paper

Qi, Y., Wang, L., Li, K., Liu, H., Zhao, C. (2024). Latent Diffusion Model-Based T2T-ViT for SAR Ship Classification. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2013. Springer, Singapore. https://doi.org/10.1007/978-981-99-9640-7_22

DOI: https://doi.org/10.1007/978-981-99-9640-7_22
Published: 05 January 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9639-1
Online ISBN: 978-981-99-9640-7
eBook Packages: Computer Science, Computer Science (R0)
