Research article · DOI: 10.1145/3638682.3638684 · VSIP Conference Proceedings

A Lightweight Anchor-Free Detector Using Cross-Dimensional Interactive Feature Balance for SAR Ship Detection

Published: 22 May 2024

Abstract

Deep convolutional neural network (CNN) models have developed rapidly in recent years and are widely used for target detection in SAR remote sensing. Many methods rely on preset anchor boxes for target classification and bounding-box coordinate regression, but these methods face two challenges when deployed on edge devices. First, ships in SAR images are sparsely and unevenly distributed, so most anchor boxes are redundant. Second, anchor-based detectors demand extensive computation during anchor generation and typically run in hardware-rich environments, whereas edge devices lack the hardware needed to deploy such CNN models. To address these issues, this paper proposes a lightweight anchor-free detector using Cross-Dimensional Interactive Feature Balance (CDIFB-Net). We adopt a keypoint strategy to predict bounding boxes, eliminating the reliance on anchors, and CDIFB-Net employs a partial convolution (PConv) based model as a lightweight feature extraction network. In addition, to handle the large target-scale differences and unbalanced target distribution characteristic of SAR ship images, and to mitigate the accuracy loss caused by making the model lightweight, this paper introduces a cross-dimensional interactive feature balance pyramid (CDI-FBP), which balances semantic information across levels through feature-pyramid aggregation and averaging; its cross-dimensional interaction module (CDIM) models the relationship between the ship target and the background. Experiments on the SAR Ship Detection Dataset (SSDD) and the High-Resolution SAR Images Dataset (HRSID), together with inference experiments in different hardware environments, validate the effectiveness of the network for SAR ship detection.
The experimental results demonstrate that the proposed CDIFB-Net maintains fast inference, low latency, and excellent detection accuracy across various hardware environments.
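Two of the components named above lend themselves to a small illustration. The sketch below is not from the paper: it is a minimal NumPy rendering of the general ideas behind partial convolution (convolve only a fraction of the channels, pass the rest through unchanged) and the aggregate-and-average step of a feature balance pyramid. The 3×3 kernel, the partial ratio `n_div=4`, nearest-neighbour resizing, and all function names are illustrative assumptions, not the authors' actual configuration.

```python
import numpy as np

def pconv(x, weight, n_div=4):
    """Partial convolution (PConv) sketch: convolve only the first
    C/n_div channels of a (C, H, W) map with a 3x3 kernel (zero
    padding) and pass the remaining channels through untouched."""
    c = x.shape[0]
    cp = c // n_div                      # channels actually convolved
    xp = x[:cp]
    h, w = xp.shape[1:]
    padded = np.pad(xp, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(xp)
    for o in range(cp):                  # naive direct convolution
        for i in range(cp):
            for dy in range(3):
                for dx in range(3):
                    out[o] += weight[o, i, dy, dx] * padded[i, dy:dy + h, dx:dx + w]
    return np.concatenate([out, x[cp:]], axis=0)

def resize_nn(x, size):
    """Nearest-neighbour resize of a (C, H, W) map to (C, size, size)."""
    c, h, w = x.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return x[:, ys][:, :, xs]

def balance_pyramid(feats, mid=16):
    """Feature balance sketch: resize every pyramid level to a common
    intermediate resolution, average them (the 'aggregation and
    averaging' step), then resize the balanced map back to each
    level's own resolution."""
    balanced = np.mean([resize_nn(f, mid) for f in feats], axis=0)
    return [resize_nn(balanced, f.shape[1]) for f in feats]
```

With `n_div=4` only a quarter of the channels incur convolution cost, which is the source of PConv's FLOP savings; the balanced pyramid returns one refined map per input level, so it can drop into a detector head in place of the raw pyramid.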



Published In

VSIP '23: Proceedings of the 2023 5th International Conference on Video, Signal and Image Processing
November 2023
237 pages
ISBN:9798400709272
DOI:10.1145/3638682
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. CDI-FBP
  2. Convolutional neural network
  3. lightweight
  4. ship detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

VSIP 2023

