
Learning to transfer attention in multi-level features for rotated ship detection

  • Original Article
Neural Computing and Applications

Abstract

Multi-scale object detection is a central problem in object detection and is particularly vital for ship detection. To achieve the desired accuracy, most advanced Convolutional Neural Network-based detectors enumerate and run inference over multi-resolution feature maps. However, existing methods suffer from two critical problems: (1) anchor settings and supervision that are over-fitted to specific object scales restrict the generalization of the algorithm, and (2) similar multi-resolution prediction branches isolate the feature space and prevent branches at different levels from learning from one another. Drawing on the human cognitive process, this paper proposes a novel structure for multi-scale rotated ship detection called the Feature Attention Transfer module, which generates and transfers attention across multi-level feature maps to instruct each prediction branch to focus on the features that are not well extracted by the other branches. Accordingly, a customized supervision method called “Inclusion–Exclusion Learning” is proposed for associative learning based on the prediction results of the multi-scale branches. We employ an anchor-free rotated ship detection framework to verify the proposed module. Extensive experiments on three optical remote sensing image datasets demonstrate the effectiveness of the proposed algorithm, called SKFat. The results show that the proposed modules improve the multi-resolution detection framework while introducing negligible inference overhead, and the best configuration achieves state-of-the-art average precision at a high inference speed.
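To make the idea concrete, the sketch below shows one plausible way an attention-transfer step could be wired on top of FPN-style prediction branches: each level generates a spatial attention map, and every branch is re-weighted by the complementary attention of the other levels so that it emphasizes regions the other branches attend to weakly. This is a minimal illustrative sketch only; the class name, attention heads, interpolation, and fusion rule are assumptions, not the paper's actual Feature Attention Transfer implementation.

```python
# Minimal illustrative sketch (assumptions: PyTorch, FPN-like inputs, 1x1-conv
# attention heads, bilinear resizing, complementary mean fusion). This is NOT
# the authors' exact Feature Attention Transfer design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAttentionTransferSketch(nn.Module):
    """Generate a spatial attention map per pyramid level and transfer the
    complementary attention of the other levels onto each branch, so a branch
    is encouraged to focus on regions the other branches cover weakly."""

    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        # One lightweight attention head per pyramid level (assumed design).
        self.attn_heads = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_levels)]
        )

    def forward(self, feats):
        # feats: list of [N, C, H_l, W_l] maps from an FPN-style neck.
        attns = [torch.sigmoid(head(f)) for head, f in zip(self.attn_heads, feats)]
        refined = []
        for i, f in enumerate(feats):
            # Resize the attention maps of the *other* levels to this level.
            others = [
                F.interpolate(a, size=f.shape[-2:], mode="bilinear", align_corners=False)
                for j, a in enumerate(attns) if j != i
            ]
            transferred = torch.stack(others, dim=0).mean(dim=0)
            # Emphasize what the other branches attend to weakly (1 - attention),
            # while keeping the original response via the residual term.
            refined.append(f * (1.0 + (1.0 - transferred)))
        return refined


if __name__ == "__main__":
    # Dummy three-level pyramid (e.g., strides 8/16/32 on a 512x512 input).
    fat = FeatureAttentionTransferSketch(channels=256, num_levels=3)
    feats = [torch.randn(1, 256, s, s) for s in (64, 32, 16)]
    print([t.shape for t in fat(feats)])
```

In an actual detector the refined maps would feed the per-level rotated-box prediction heads, and a supervision scheme along the lines of the “Inclusion–Exclusion Learning” described above would couple the losses of those heads; both are beyond this sketch.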



Funding

This work was supported in part by the Natural Science Foundation of China under Grant 71671178, in part by the Equipment Advance Research Fund under Grant 6142502180101, and in part by the Fundamental Research Funds for the Central Universities.

Author information


Corresponding author

Correspondence to Cheng Wang.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cui, Z., Liu, Y., Zhao, W. et al. Learning to transfer attention in multi-level features for rotated ship detection. Neural Comput & Applic 34, 19831–19844 (2022). https://doi.org/10.1007/s00521-022-07491-z
