Multi-scale aggregation feature pyramid with cornerness for underwater object detection

Li, Xinbin; Yu, Haifeng; Chen, Haiyang

doi:10.1007/s00371-023-02849-3

Original article
Published: 09 April 2023

Volume 40, pages 1299–1310, (2024)
The Visual Computer

Abstract

Underwater object detection is a fascinating but challenging subject in computer vision. Features are difficult to extract because underwater images suffer from color cast and blur. Moreover, because underwater objects are often small, some details are lost after several layers of convolution. Therefore, a multi-scale aggregation feature pyramid network is proposed to integrate multi-scale features and improve underwater object detection performance. Specifically, a lightweight and efficient network is used to extract the basic features. A special subnet is designed to improve the feature extraction capability of the backbone network and enrich the detailed features of small underwater objects. In addition, a multi-scale feature pyramid is proposed to enrich the feature maps: each feature map enhances its contextual information through a combination of up-sampling and down-sampling. The centerness strategy of the fully convolutional one-stage object detection (FCOS) head is improved by adding corner point regression to enhance the recall rate of small objects. Because generalized intersection over union (GIoU) reflects the degree of coincidence between the actual box and the predicted box better than IoU does, the regression loss is changed to GIoU loss. The network is evaluated on an underwater image dataset, where it obtains 78.90% mAP, and on the PASCAL VOC datasets, where it obtains 84.3% mAP.
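To make the GIoU regression loss mentioned in the abstract concrete, the following is a minimal sketch of the standard GIoU definition (IoU minus the fraction of the smallest enclosing box not covered by the union) and the corresponding loss, 1 - GIoU. It is not the authors' implementation: the function names `giou` and `giou_loss`, the `(x1, y1, x2, y2)` box layout, and the assumption that both boxes have positive area are illustrative choices.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    (an assumed layout; positive area is required).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """GIoU loss, 1 - GIoU; ranges from 0 (perfect match) towards 2."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU becomes more negative as the boxes move apart, so the loss still distinguishes near misses from distant ones. This matters for small objects, where predicted and actual boxes often do not overlap at all.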

Data availability

The data that support the findings of this study are available from Peng Cheng Laboratory. Restrictions apply to the availability of these data, which were used under license for this study. Data are available at https://aistudio.baidu.com/aistudio/datasetdetail/25886 with the permission of Peng Cheng Laboratory.

Acknowledgements

This work was supported in part by the Hebei Natural Science Foundation, China, under Grants F2020203037 and F2022203025; in part by the National Natural Science Foundation of China under Grants 61873224, 62271437, and 62003295; and in part by the Science and Technology Research Project of Universities in Hebei, China, under Grant QN2020301.

Author information

Authors and Affiliations

Institute of Electrical Engineering, Yanshan University, Qinhuangdao, 066004, Hebei Province, China
Xinbin Li & Haiyang Chen
College of Electrical Engineering, North China University of Science and Technology, Tangshan, 063210, China
Haifeng Yu

Corresponding author

Correspondence to Haifeng Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, X., Yu, H. & Chen, H. Multi-scale aggregation feature pyramid with cornerness for underwater object detection. Vis Comput 40, 1299–1310 (2024). https://doi.org/10.1007/s00371-023-02849-3

Accepted: 09 March 2023
Published: 09 April 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00371-023-02849-3
