Fast detection of cannibalism behavior of juvenile fish based on deep learning

https://doi.org/10.1016/j.compag.2022.107033

Highlights

  • A detection neural network is proposed for the cannibalism behavior of juvenile fish.

  • The network improves the detection accuracy by adding an attention mechanism and changing the feature fusion network.

  • The network improves the detection speed by using lightweight modules.

  • Experiments show that the algorithm can detect the cannibalism behavior in aquaculture scenes in real time.

Abstract

Detecting special fish behaviors is an effective way to ensure fish welfare and make aquaculture more intelligent. However, owing to the breeding environment and physiological factors, fish are prone to cannibalism during the juvenile period. To monitor fry behavior effectively, this article proposes a real-time detection scheme based on an improved You Only Look Once (YOLO) v5 to detect cannibalism among grouper fry in a recirculating aquaculture system (RAS). The specific improvements are as follows: (1) a multi-head attention mechanism is used in the last block of the backbone network to obtain global information; (2) following the idea of BiFPN, nodes that contribute little to the feature fusion network in the neck are deleted, and feature fusion is added for nodes whose input and output are in the same layer; (3) the lightweight general upsampling operator CARAFE replaces the original upsampling. The experimental results show that the improved YOLOv5 model achieves 97% accuracy in detecting cannibalism behavior of juvenile fish, effectively addressing the problems of small targets, severe occlusion and motion blur in the culture environment. Compared with the YOLOv5s model, the improved model improves detection accuracy and processing speed by 12.6% and 14%, respectively, meeting the requirements of high accuracy and real-time detection. In an actual aquaculture environment, farmers can take corresponding measures according to the detection results, which can effectively improve the survival rate and economic benefits of aquaculture.

Introduction

As a key component of aquaculture products, fish is an important material basis for the supply of high-quality protein and plays a central role in global food and nutrition security. Fish farming is now gradually shifting to a new intensive model that is factory-based and intelligent. As the scale of aquaculture continues to expand, more and more attention has been paid to fish welfare. When farmed fish are affected by physiological and environmental changes, their welfare is reflected in their behavior (Lee et al., 2003, Bracke and Hopster, 2006, Ashley, 2007, Kiessling et al., 2012, Bergqvist and Gunnarsson, 2013, Jones, 2013, Mattiasen et al., 2020). Cannibalism is a common physiological phenomenon in grouper at the juvenile stage, when factors such as light intensity, population density, and growth differences can trigger it (Smith and Reay, 1991). Mutual tearing between fry damages fish health, affects fry growth and brings indirect losses to fish breeding and culture. Therefore, continuous monitoring and identification of cannibalism behavior will help reduce fish casualties, which is of great practical significance for improving the economic efficiency of enterprises, improving fish welfare and ensuring food safety.

Computer vision (CV) technology is widely used in automatic fish identification, classification and production status monitoring because of its speed, objectivity and high precision (Hsiao et al., 2014, Yu et al., 2021). Early computer vision approaches to target detection relied mainly on traditional machine learning methods. For example, Yu et al. (2021) used Harris corner detection and Lucas-Kanade optical flow to extract the feature points and speed of sub-images, and used these quantities to quantify special behaviors and so detect abnormal behaviors of dynamic fish schools. However, optical flow calculation is complicated and usually slow, and it requires a stable background image, so it is strongly affected by lighting. In contrast, Hsiao et al. (2014) used Gaussian Mixture Model (GMM) background modeling, comparing the video stream with the background to detect underwater fish. This frame difference method is simple to compute and sensitive to moving targets. However, owing to the limitations of the algorithm, the model cannot extract all moving targets, and detection errors are large for targets that move too slowly between frames.
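
The limitation noted above can be illustrated with a minimal sketch. It uses plain two-frame differencing on a one-dimensional row of pixel intensities, which is a much simpler stand-in for the GMM background model of Hsiao et al. (2014) and not their method. The function name `moving_mask`, the threshold of 30 and the toy frames are assumptions made purely for illustration.

```python
def moving_mask(prev_frame, frame, threshold=30):
    """Flag pixels whose intensity changed by more than `threshold`.

    Hypothetical helper: plain frame differencing, not GMM background modeling.
    """
    return [abs(a - b) > threshold for a, b in zip(prev_frame, frame)]


# A bright object 4 pixels wide (intensity 200) on a dark background,
# shifting right by a single pixel between two consecutive frames.
prev_frame = [0, 0, 200, 200, 200, 200, 0, 0, 0, 0]
frame = [0, 0, 0, 200, 200, 200, 200, 0, 0, 0]

mask = moving_mask(prev_frame, frame)
changed = [i for i, flag in enumerate(mask) if flag]
print(changed)  # indices of pixels detected as moving
```

Only the trailing and leading edges of the object (indices 2 and 6) exceed the threshold, so 2 of the object's 4 pixels are flagged and the overlapping interior is missed. This is one way a slowly moving target ends up only partly extracted.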

Traditional machine learning requires manual feature extraction, which places demands on the expertise of operators. It also has low robustness and struggles to meet accuracy requirements in complex breeding environments. Deep learning is a data-driven approach that mines data for high-dimensional features and deep information, which makes it well suited to modeling non-linear relationships. Because of its excellent performance in target detection, it is widely used in fish behavior detection (Zhao et al., 2018, Måløy et al., 2019, Hu et al., 2021, Wang et al., 2021). For example, Måløy et al. (2019) proposed a deep learning-based dual-stream recurrent network (DSRN) that uses CNN and LSTM to automatically capture the spatio-temporal behaviour of swimming salmon and predict feeding and non-feeding behaviour; Zhao et al. (2018) developed an RNN-based method for detecting the abnormal behaviors of sudden gathering and sudden escape in intensively farmed fish schools, with an average detection precision of 89.89%. The YOLO series, currently among the best single-stage detection algorithms, is likewise widely used in target detection tasks. For instance, Hu et al. (2021) used an improved YOLOv3-Lite to accurately identify fish starvation and hypoxia behaviors, and Wang et al. (2021) used improved YOLOv5 and SiamRPN++ algorithms to detect and track abnormal fish behaviors in real time with better results.

YOLO offers high accuracy and fast detection, and has good application prospects in fish detection. However, detecting juvenile fish and their cannibalism behavior remains challenging because of small targets and serious occlusion. In the initial stage of this research, we used the original YOLO algorithm for detection, and the results were not satisfactory. To solve these problems, this paper proposes a method for detecting cannibalism behavior of juvenile fish in the RAS environment. The method is based on an improved YOLOv5 network with an optimized design that improves the detection precision of cannibalism behaviors. It is intended to provide a theoretical basis for setting scientific breeding volume and density and for dividing culture areas. The rest of this article is organized as follows: Part 2 introduces the data set, the improved algorithm and the improved network; Part 3 presents the results and discussion; Part 4 gives the conclusion.

Materials and methods

The breeding, treatment and experiment of fish in this study were carried out in strict accordance with the guidelines of the Experimental Animal Welfare Ethics Committee of China Agricultural University (NO.AW30901202-5-1).

Model performance analysis

The experimental environment of the model used in this article is shown in Table 1.

For all algorithms in this experiment, the momentum is set to 0.9, the initial learning rate to 0.001, and the weight decay to 0.0005. Given the GPU memory limit during training, the batch size is set to 8, and training runs for a total of 100 iterations.
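
As a rough illustration of how these hyperparameters interact, the sketch below applies one parameter update of stochastic gradient descent with momentum. The text does not name the optimizer, so SGD is an assumption here, as is reading the 0.0005 setting as L2 weight decay folded into the gradient. The function name `sgd_momentum_step` is hypothetical.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=0.001, momentum=0.9, weight_decay=0.0005):
    """One SGD-with-momentum update (assumed optimizer, not confirmed by the paper).

    weights, grads and velocity are equal-length lists of floats.
    Returns the updated (weights, velocity).
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w   # L2 penalty added to the raw gradient
        v = momentum * v + g       # running velocity keeps 90% of its past value
        new_velocity.append(v)
        new_weights.append(w - lr * v)  # step against the velocity
    return new_weights, new_velocity


# Toy usage: a single weight of 1.0 with gradient 0.5 and zero initial velocity.
weights, velocity = sgd_momentum_step([1.0], [0.5], [0.0])
print(weights, velocity)
```

With a momentum of 0.9, a steady gradient accumulates over steps, so the effective step size grows well beyond the nominal learning rate of 0.001. The small weight decay term pulls each weight slightly toward zero on every update.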

This article uses precision, recall, F-score, AP50, AP50:95 and FPS (GPU) as evaluation indicators. Precision represents the prediction accuracy of the positive
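
The snippet above is cut off before the definitions, so the sketch below uses the standard textbook forms of these indicators, which are assumed (not confirmed) to match the paper's. The F-score is taken to be the harmonic mean of precision and recall (F1), and AP50 conventionally counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The function names and box layout `(x1, y1, x2, y2)` are hypothetical.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and
    false-negative counts (standard definitions, assumed here)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Toy usage: 8 correct detections, 2 false alarms, 2 missed events.
print(detection_metrics(8, 2, 2))
# Two 2x2 boxes overlapping in a 1x1 square.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```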

Performance analysis of the multi-head attention mechanism

Attention mechanisms are widely used in image recognition. For example, LRNet (Hu et al., 2019) and stand-alone networks (Ramachandran et al., 2019) explore local self-attention to avoid the heavy computation of global self-attention. Axial-Attention (Wang et al., 2020a, Wang et al., 2020b, Wang et al., 2020c) decomposes global spatial attention into two separate axial attentions, greatly reducing the amount of computation. The multi-head attention mechanism is to
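
To make the general idea concrete, the following is a minimal pure-Python sketch of multi-head self-attention. It is not the module used in the paper: the learned query, key and value projections of a real layer are replaced by identity mappings (an assumption made to keep the example short), each head simply attends over its own slice of the feature vector, and all function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for lists of equal-length vectors."""
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        # Weighted average of the value vectors, one output channel at a time.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def multi_head_self_attention(tokens, num_heads):
    """Split features into heads, attend per head, then concatenate.

    Simplification: no learned projections (identity Q/K/V and output maps).
    """
    dim = len(tokens[0])
    if dim % num_heads:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    head_outputs = []
    for h in range(num_heads):
        part = slice(h * head_dim, (h + 1) * head_dim)
        x = [t[part] for t in tokens]
        head_outputs.append(attention(x, x, x))
    # Concatenate the per-head outputs back to the original feature width.
    return [sum((head[i] for head in head_outputs), [])
            for i in range(len(tokens))]


# Toy usage: 3 tokens (e.g. flattened feature-map positions), 4 features, 2 heads.
tokens = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
print(multi_head_self_attention(tokens, num_heads=2))
```

Every token attends to every other token, which is how such a block gathers global information, and also why its cost grows quadratically with the number of positions. That cost is consistent with placing it only in the last backbone block, where the feature map is smallest.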

Conclusions

In order to overcome the challenges of occlusion, motion blur and small targets in the detection of juvenile fish and their cannibalism, this paper proposes an improved YOLOv5s network to identify the cannibalism behavior in underwater images. By adding an attention module to the YOLOv5s, improving the connection method of the feature fusion network, and introducing a lightweight upsampling operator, the detection precision and AP50 increased by 12.6% and 12.2% respectively, and the network

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This paper was supported by the Ministry of Science and Technology of the People’s Republic of China (Grant No. 2019YFE0103700), the Hebei Province Department of Science and Technology (Grant No. 20327217D) and the 2115 Talent Development Program of China Agricultural University.

References (24)

  • Hu, H., Zhang, Z., Xie, Z., Lin, S., 2019, Local Relation Networks for Image...
  • Jones, R.C., 2013, Science, sentience, and animal welfare, Biol. Philos.