Abstract
Ship detection is important for port monitoring and, in particular, for the safe navigation of Unmanned Surface Vehicles (USVs). However, recent deep-learning-based ship detectors lack comprehensive ship datasets and rank detections by classification score alone, both of which limit their performance. To address these problems, we present LEDet, a one-stage localization estimation detector with ship-customized data augmentation. Specifically, we integrate localization quality estimation into the classification branch as a soft-label localization score. We further apply a ship-customized data augmentation, named “cutting-transform-paste”, that expands ship datasets without manual annotation, so that a large number of diverse ship samples can be created. Extensive experiments show that LEDet consistently exceeds a strong baseline by 8.0% COCO-style Average Precision (AP) with a ResNet-50 backbone, significantly improving ship detection performance.
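To make the idea of a soft-label localization score concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the exact target or loss, so the sketch assumes that the soft label is the IoU between a predicted box and its ground-truth box, and that the classification branch is trained with a loss in the style of Quality Focal Loss (from the Generalized Focal Loss work). The function names `iou` and `soft_label_loss`, the `beta` parameter, and the example boxes are all hypothetical.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_label_loss(score, target, beta=2.0, eps=1e-12):
    """Assumed QFL-style loss: cross-entropy against a soft target in [0, 1],
    scaled by |target - score|**beta so well-calibrated scores contribute little."""
    score = min(max(score, eps), 1.0 - eps)  # keep the logarithms finite
    bce = -(target * math.log(score) + (1.0 - target) * math.log(1.0 - score))
    return abs(target - score) ** beta * bce


# Hypothetical predicted and ground-truth boxes for one positive sample.
pred_box = (10.0, 10.0, 50.0, 50.0)
gt_box = (20.0, 20.0, 60.0, 60.0)

# The soft label replaces the usual hard label of 1 for the ship class.
quality = iou(pred_box, gt_box)

# An overconfident score is penalized more than one close to the IoU target.
loss_overconfident = soft_label_loss(0.9, quality)
loss_calibrated = soft_label_loss(0.4, quality)
```

Under these assumptions the classification score is pushed toward the box's localization quality rather than toward 1, so ranking detections by that score during non-maximum suppression favors well-localized boxes.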
Acknowledgements
This work was supported by the National Key Research and Development Program of China (No. 2020YFC1521700), the Major Projects of the National Natural Science Foundation of China (No. 61991415), the Joint Funds of the National Natural Science Foundation of China (No. U1813217), the National Natural Science Foundation of China (No. 51904181), and the Shanghai Municipal Natural Science Foundation (No. 21ZR1423300).
Cite this article
Zhou, Y., Lv, J., Wang, Y. et al. LEDet: localization estimation detector with data augmentation for ship detection based on unmanned surface vehicle. Int J Intell Robot Appl 6, 216–230 (2022). https://doi.org/10.1007/s41315-022-00238-y
DOI: https://doi.org/10.1007/s41315-022-00238-y