Abstract
Underwater object detection (UOD) suffers from low detection accuracy due to environmental degradations such as haze-like effects, color distortions, and imaging noise. We therefore address object detection under compounded environmental degradations, a setting that greatly challenges existing deep learning-based detectors. We propose a neural architecture search (NAS)-based deep learning network for the UOD task that automatically discovers a scene-oriented feature representation. Our network comprises a unified macro-detector and a novel search space built on a mixed anti-aliasing block (MAaB). The macro-detector aims to learn intrinsic feature representations automatically from underwater images containing various environmental degradations and to complete the subsequent detection task. The MAaB-based search space is designed for complex underwater scenes: the candidate operator MAaB combines multiple kernel sizes and anti-aliased convolutions in a single block, boosting contextual representation capacity and robustness to degradation factors. Finally, a differentiable search strategy guides the whole learning process to obtain scene-friendly results. Extensive experiments demonstrate that our method outperforms state-of-the-art detectors by a large margin; in cases of severe environmental degradation, it also remains superior to other popular detectors.
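To make the MAaB idea concrete, the sketch below illustrates its two ingredients in a minimal 1-D NumPy form: a mixed-kernel convolution that applies different kernel sizes to different channel groups (in the spirit of MixConv) and anti-aliased downsampling that low-pass blurs before subsampling (in the spirit of Zhang's BlurPool). This is a hypothetical toy illustration of the block's principle, not the authors' implementation; all function names, kernel choices, and the 1-D setting are assumptions for exposition.

```python
import numpy as np

def blur_pool_1d(x, stride=2):
    """Anti-aliased downsampling: low-pass blur with a binomial
    filter, then subsample (assumed BlurPool-style behavior)."""
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0  # binomial low-pass filter
    padded = np.pad(x, 1, mode="edge")
    blurred = np.array([np.dot(padded[i:i + 3], kernel) for i in range(len(x))])
    return blurred[::stride]

def mixed_kernel_conv_1d(x, kernels):
    """Mixed-kernel idea: split the signal into groups, convolve each
    group with a different-sized kernel, then concatenate, so one block
    sees several receptive-field sizes at once."""
    groups = np.array_split(x, len(kernels))
    outs = []
    for g, k in zip(groups, kernels):
        pad = len(k) // 2
        padded = np.pad(g, pad, mode="edge")
        outs.append(np.array([np.dot(padded[i:i + len(k)], k) for i in range(len(g))]))
    return np.concatenate(outs)

def maab_block(x):
    """Hypothetical MAaB sketch: mixed-kernel convolution followed by
    anti-aliased (blur-then-subsample) downsampling."""
    kernels = [np.ones(3) / 3.0, np.ones(5) / 5.0]  # two receptive-field sizes
    mixed = mixed_kernel_conv_1d(x, kernels)
    return blur_pool_1d(mixed)

signal = np.arange(16, dtype=float)
out = maab_block(signal)
print(out.shape)  # prints (8,)
```

The blur-before-subsample step is what makes the block robust to small spatial shifts, which is the anti-aliasing property the abstract attributes to MAaB; the grouped kernels supply multi-scale context within a single candidate operator.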
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "SSoB: Searching a Scene-Oriented Architecture for Underwater Object Detection."
Cite this article
Yuan, W., Fu, C., Liu, R. et al. SSoB: searching a scene-oriented architecture for underwater object detection. Vis Comput 39, 5199–5208 (2023). https://doi.org/10.1007/s00371-022-02654-4