Abstract
Realizing high-precision detection of golden pomfret is of great significance for the intelligent management of fishery farming. However, the highly variable size of the objects and the degree of overlap between them make optimizing the algorithm challenging. To address these problems, we propose a golden pomfret detection algorithm that combines an improved transformer with the YOLOv5 framework to surpass not only the canonical transformer but also high-performance convolutional modules. The methods are designed as follows: (1) For the transformer framework, we design a transformer with a progressively increasing number of cascaded tokens, which aims to improve detection accuracy by adaptively learning grid parameters based on the size of the golden pomfret in each image. To achieve high performance, a large-kernel convolution is included between the input image and the feature space mapping. (2) Based on YOLOv5, we redesign the prediction head to handle golden pomfret of different sizes. We then replace the original prediction heads with deformable prediction heads to further improve network performance and training efficiency through fine-grained feature mapping of golden pomfret. In particular, the deformable convolution uses a novel generalized linear interpolation algorithm to reduce detection errors. (3) To improve the robustness of the network, we introduce a bag of useful strategies such as data augmentation and polynomial interpolation. Experimental results on the golden pomfret test set show that the mAP exceeds that of the original YOLOv5 network by 22.59%. Our algorithm can therefore effectively detect golden pomfret in complex ocean scenes.
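To make the sampling step behind a deformable prediction head concrete, the following is a minimal sketch of standard bilinear interpolation at a fractional, offset position, which is how a conventional deformable convolution reads its input. It is not the generalized linear interpolation algorithm proposed in this work, whose details are not given in the abstract. The function names `bilinear_sample` and `deformable_tap`, the zero-padding behaviour outside the map, and the toy feature map are all assumptions made for illustration.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2D feature map (list of rows) at fractional position (y, x).

    Assumption: positions outside the map contribute zero, a common
    convention in deformable-convolution implementations.
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    value = 0.0
    # Blend the four integer neighbours, weighted by closeness to (y, x).
    for yi, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xi, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= yi < h and 0 <= xi < w:
                value += wy * wx * feature[yi][xi]
    return value


def deformable_tap(feature, y, x, offset):
    """Read one kernel tap at its regular grid position plus an offset.

    In a trained network the offset would be predicted per location;
    here it is passed in by hand (hypothetical values).
    """
    off_y, off_x = offset
    return bilinear_sample(feature, y + off_y, x + off_x)


# Toy 2x2 feature map; the offset moves the tap to the centre of the map.
feature = [[0.0, 1.0], [2.0, 3.0]]
center = deformable_tap(feature, 0, 0, (0.5, 0.5))
```

With the half-pixel offset, the tap averages the four surrounding values equally. Because the sampled value varies smoothly with the offset, offsets can be learned by gradient descent, which is what lets the sampling grid adapt to objects of different sizes and shapes.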
Data availability
The golden pomfret datasets generated during the current study are available from the corresponding author on reasonable request.
Acknowledgments
The work was financially supported by the Guangdong Interregional Collaborative Fund (No. 2019B1515120017), the Guangdong Special Project of Ocean Economic Development (No. 011Z21001), the Zhanjiang project of Innovation and Entrepreneurship Team “Pilot Program” (No. 2020LHJH003), the Zhanjiang Key Laboratory of Modern Marine Fishery Equipment (No. 2021A05023), and the program for scientific research start-up funds of Guangdong Ocean University (No. 060302062106).
Author information
Contributions
Guoyan Yu provided the golden pomfret datasets and project support. Yingtong Luo wrote the main manuscript text, and Ruoling Deng proposed the idea for the improved algorithm.
Ethics declarations
Conflict of interest
We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.
About this article
Cite this article
Yu, G., Luo, Y. & Deng, R. An detection algorithm for golden pomfret based on improved YOLOv5 network. SIViP 17, 1997–2004 (2023). https://doi.org/10.1007/s11760-022-02412-y
DOI: https://doi.org/10.1007/s11760-022-02412-y