Deformable Feature Pyramid Network for Ship Recognition

Ding, Yao; Zhang, Yichen; Qu, Yanyun; Li, Cuihua

doi:10.1007/978-3-030-00764-5_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11166)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Ship recognition under complex sea environments and weather conditions is a challenging task because ship appearance changes greatly, especially under large geometric transformations. The original feature pyramid network (FPN) [4] does not perform well when applied directly to ship detection, because its building modules use fixed geometric structures. In this paper, a deformable feature pyramid network is designed for ship recognition. The contributions are threefold: (1) We replace the fixed geometric structure of the original feature pyramid network with a deformable geometric structure and use dilated convolution [12] instead of the original convolution. Correspondingly, deformable position-sensitive RoI pooling replaces the fixed geometric RoI pooling in the RoI-wise subnetwork. (2) The focal loss function [6] replaces the original mixed cross-entropy loss function. (3) Decay-NMS, a new post-processing method, is designed to improve the detection accuracy. The experimental results demonstrate the effectiveness and efficiency of our model.
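
As a rough illustration of contribution (2), the binary focal loss of [6] can be written as FL(p_t) = -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The sketch below is a minimal scalar version, not the chapter's implementation: the function names `focal_loss` and `cross_entropy` are hypothetical, and the defaults γ = 2 and α = 0.25 are the values commonly quoted from [6], assumed here because the abstract does not state the settings used for ship recognition.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (shown for comparison)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p    # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p     : predicted probability of the positive class, in (0, 1)
    y     : ground-truth label, 1 (object) or 0 (background)
    gamma : focusing parameter (assumed default, not taken from the chapter)
    alpha : class-balancing weight (assumed default, not taken from the chapter)
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With γ = 0 the modulating factor disappears and the loss reduces to α-weighted cross-entropy; larger γ places more of the total loss on hard, misclassified examples, which is the usual motivation for using it in detectors dominated by easy background samples.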

References

1. Everingham, M., Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
2. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
3. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
4. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. arXiv:1612.03144 (2016)
5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, pp. 770–778 (2016)
6. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: IEEE International Conference on Computer Vision (ICCV), Venice, pp. 2999–3007 (2017)
7. Liu, W., Anguelov, D., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
8. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788 (2016)
9. Neubeck, A., Gool, L.V.: Efficient non-maximum suppression. In: 18th International Conference on Pattern Recognition (ICPR 2006), Hong Kong, pp. 850–855 (2006)
10. Shrivastava, A., Gupta, A., Girshick, R.: Training region-based object detectors with online hard example mining. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, pp. 761–769 (2016)
11. Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., et al.: Deformable convolutional networks. In: IEEE International Conference on Computer Vision (ICCV), Venice, pp. 764–773 (2017)
12. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: International Conference on Learning Representations (2016)
13. Bodla, N., Singh, B., Chellappa, R., Davis, L.S.: Soft-NMS—improving object detection with one line of code. In: IEEE International Conference on Computer Vision (ICCV), Venice, pp. 5562–5570 (2017)
14. http://www.datafountain.cn/projects/2017CCF/
15. Yu, F., Koltun, V., Funkhouser, T.: Dilated residual networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, pp. 636–644 (2017)
16. Li, Y., He, K., Sun, J., et al.: R-FCN: Object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems, pp. 379–387 (2016)
17. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)

Author information

Authors and Affiliations

Computer Science Department, Xiamen University, Xiamen, China
Yao Ding, Yichen Zhang, Yanyun Qu & Cuihua Li

Corresponding author

Correspondence to Yanyun Qu.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

About this paper

Cite this paper

Ding, Y., Zhang, Y., Qu, Y., Li, C. (2018). Deformable Feature Pyramid Network for Ship Recognition. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11166. Springer, Cham. https://doi.org/10.1007/978-3-030-00764-5_6

DOI: https://doi.org/10.1007/978-3-030-00764-5_6
Published: 18 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00763-8
Online ISBN: 978-3-030-00764-5
eBook Packages: Computer Science, Computer Science (R0)
