Original papers
Real-time sow behavior detection based on deep learning

https://doi.org/10.1016/j.compag.2019.104884

Highlights

  • The SBDA-DL network model is constructed based on the SSD and the MobileNet.

  • The model accelerates detection by compressing the SSD network model: the last two convolutional layers in the detection network are deleted.

  • The algorithm was used for real-time detection of three typical sow behaviors: drinking, urination, and mounting.

  • The model is applied to real farm video detection, and its advantages for sow behavior detection are compared against alternative algorithms.

Abstract

Recording sow behaviors allows tracking their health status, enables the timely detection of abnormalities, and helps improve their health both physically and mentally. In recent years, detecting sow behavior using machine vision technology has become a popular research topic. However, current detection methods are based on the premise that an individual pig can be accurately identified. In this paper, a Real-Time Sow Behavior Detection Algorithm based on Deep Learning (SBDA-DL) is proposed. The algorithm was used for real-time detection of three typical sow behaviors: drinking, urination, and mounting. The experimental results show that the average precision (AP) of the algorithm when detecting drinking, urination, and mounting behaviors is 96.5%, 91.4%, and 92.3%, respectively. The mean average precision (mAP) across the categories is 93.4%, and the algorithm can reach 7 frames per second on commonly configured microcomputers. The algorithm uses an optimized deep learning network structure to directly detect sow behavior. This improves the accuracy of behavior detection at a processing speed sufficient for real-time detection and meets the requirements of daily monitoring by auxiliary staff in most pig breeding farms.

Introduction

In 2017, China produced 68.861 million live pigs, an increase of 0.8% over 2016, and produced 52.99 million tons of pork, an increase of 0.5% over 2016, which accounted for more than half the global pork production (Chen et al., 2011). A sow is one of the most important members of a pig farm. Various types of behavioral data for sows are important when accurately predicting the estrus time (Xia et al., 2010).

Monitoring sow behavior helps reflect their health status. Feeding and drinking behaviors can help determine their food intake. When pigs have diarrhea and other diseases, their drinking behaviors can become abnormal (Kruse et al., 2011). Recording their urination behaviors can determine their urination frequency, and observed changes in sow urination allow a preliminary judgment of their health status (Xia et al., 2015). Studying the mounting behavior of sows can determine their excitement level (Liu et al., 2013) and predict their estrus time.

There has been some research about observing or predicting pig behavior. Oczak et al. (2013) used a multilayer feedforward neural network to train five characteristic pairs of activity indices to identify aggressive behaviors in pigs. Jonguk et al. (2016) used a Kinect depth sensor to identify aggressive behaviors in pigs. Kashiha et al. (2013) used a water meter to monitor the water use rate of each pigsty and used a camera to capture their drinking behaviors. They then studied the relationship between the water use rate and the drinking behavior. Madelyne et al. (2016) found that the drinking time of pigs measured using Radio Frequency Identification (RFID) was highly correlated with the observed time and quantity of drinking. Nasirahmadi et al. (2016) used an elliptical fit to locate pigs in pens and compared the relationship between the fitting proportion of two ellipses and the mounting behavior of pigs to determine whether mounting occurred and measure its duration. Costa et al. (2013) analyzed the relationship between pig activity and climate change through image processing technology and found a significant correlation between the pig occupation index and the ventilation rate, temperature, and humidity. Yang et al. (2017) established a behavior recognition model of captive porcupines based on the structure of breeding pools and the actual porcupine activity. They recognized seven basic porcupine behaviors, such as resting, feeding, drinking, excretion, and biting on the iron gate and sink.

With the growth of deep learning, object detection techniques have developed extensively in recent years (Yu et al., 2013). Traditional target detection methods extract image features using Haar-like features (Viola and Jones, 2001), histograms of oriented gradients (HOG) (Dalal et al., 2005), or the scale-invariant feature transform (SIFT) (Lowe, 2004), and perform target classification using a support vector machine (SVM) (Andrew, 2000) or the adaptive boosting (AdaBoost) algorithm (Freund et al., 1997). The deformable part model (DPM) was proposed by Girshick et al. (2013) and uses a structure similar to a convolutional neural network (CNN) for target detection, which reached 43% mAP in the PASCAL VOC2007 competition (Everingham et al., 2015). The region-based CNN (R-CNN) proposed by Girshick et al. (2013) uses a selective search, a deep learning network, and an SVM to perform detection, feature extraction, and classification, respectively. This approach achieved 58% mAP on the PASCAL VOC2007, which is more than 15% better than using the selective search alone for target detection (Uijlings et al., 2013). This also shows that deep learning can be successfully used for target detection.

The spatial pyramid pooling in deep convolutional network (SPP-Net) was proposed by He et al. (2015) and uses a spatial pyramid pooling layer to output a fixed-size feature image for an input image of arbitrary size. However, this method reduces the detection accuracy due to deformations in the detected image. The fast R-CNN proposed by Girshick et al. (2015) maps a region box to the feature map of the previous convolutional layer. Such a detection scheme only needs to extract features once, which greatly improves the detection speed. The faster R-CNN proposed by Ren et al. (2017) replaces the detection part with a deep CNN and directly performs detection and recognition within the feature map, which improves the accuracy on the PASCAL VOC2007 to 73%. However, these network frameworks have slow detection speeds, making it difficult to apply them in real cases. Redmon et al. (2016) proposed the you only look once (YOLO) network framework, which differs from previous target detection approaches because it uses a single deep CNN to detect and recognize the target simultaneously. This comes with a slight reduction in mAP but a greatly accelerated speed of 21 frames per second. The single shot multibox detector (SSD) network model was proposed by Liu et al. (2016), which improves the processing speed to 46 frames per second while maintaining an accuracy similar to the faster R-CNN.

Current detection methods take the accurate identification of individual pigs as a premise and generally only detect a single pig behavior. There have been no scientific reports on detecting multiple pig behaviors simultaneously. Therefore, this paper proposes a Real-Time Sow Behavior Detection Algorithm based on Deep Learning (SBDA-DL), which is the first time sow behavior recognition has been described using a target detection method based on deep learning. The method can train multiple categories with large differences on a small amount of data. The main sow behaviors of interest are drinking, urination, and mounting, which are detected in real-time. Moreover, the processing speed is accelerated while ensuring detection accuracy, so that the algorithm can operate smoothly on conventional computing equipment.

Section snippets

Experimental device

This study was conducted at a sow farm of the Guangzhou Lizhi Agriculture Co., Ltd (http://www.lzpig.com/). A 3-megapixel (Haikang DS-2CD3355-I) infrared network camera was placed on a beam directly above the center of the pig pen. The length-to-width ratio of the pigsty is 6:5, the overall area is approximately 20 square meters, and the beam is around 5 m above the ground. The webcam is wired to a network video recorder (NVR) device and operates 24 h a day with a frame rate of 25 frames

SBDA-DL network model results

To verify the effectiveness of the proposed algorithm, the mAP, the most commonly used evaluation criterion in the object detection field, is considered as the accuracy metric. The mAP is the mean of the AP values over the individual categories. The calculations are given in Eqs. (7), (8).

AP = \frac{\text{bounding box} \cap \text{ground truth}}{\text{ground truth}} \quad (7)

mAP = \frac{1}{n} \sum_{i=1}^{n} (AP)_i \quad (8)

The bounding box refers to the detection results of the image obtained using the object detection algorithm, and the ground truth
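The two equations above can be sketched in code. This is a minimal illustration only: it assumes boxes are given as (x1, y1, x2, y2) tuples and reads Eq. (7) as the overlap area between predicted and ground-truth boxes divided by the ground-truth area; the paper's exact matching protocol may differ.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection_area(a, b):
    """Overlap area of two axis-aligned boxes."""
    x1 = max(a[0], b[0]); y1 = max(a[1], b[1])
    x2 = min(a[2], b[2]); y2 = min(a[3], b[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def average_precision(pred_boxes, gt_boxes):
    """Eq. (7): AP = overlap(bounding box, ground truth) / ground truth area."""
    gt_total = sum(box_area(g) for g in gt_boxes)
    if gt_total == 0:
        return 0.0
    # For each ground-truth box, take its best overlap with any prediction.
    overlap = sum(
        max(intersection_area(p, g) for p in pred_boxes) if pred_boxes else 0.0
        for g in gt_boxes
    )
    return overlap / gt_total

def mean_average_precision(per_category_aps):
    """Eq. (8): mAP = (1/n) * sum of the per-category AP values."""
    return sum(per_category_aps) / len(per_category_aps)
```

For example, averaging the paper's reported per-category APs, `mean_average_precision([0.965, 0.914, 0.923])`, gives 0.934, matching the reported 93.4% mAP.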

Conclusions

In this paper, an object detection method based on deep learning is proposed for the first time to detect the drinking, urination, and mounting behaviors of sows in real-time using videos of their behavior. The experimental results show that the average precisions of the proposed algorithm when detecting these sow behaviors are 96.5%, 91.4%, and 92.3%, respectively, and the mAP across the categories is 93.4%, while the algorithm can reach 7 frames per second. The average category accuracy for the SBDA-DL

Acknowledgements

This work was supported by the National Key Research and Development Program of China (grant number 2017YFD0701601) and Guangdong Science and Technology Program of China (grant numbers 2019B020215004 and 2019B020215002).

References (38)

  • D. Erhan et al.

    Scalable object detection using deep neural networks

    IEEE Conf. Comp. Vision Pattern Recogn.

    (2014)
  • R. Girshick et al.

    Training deformable part models with decorrelated features

    IEEE Int. Conf. Comp. Vision

    (2013)
  • R. Girshick

    Fast R-CNN

    ICCV

    (2015)
  • K. He et al.

    Spatial pyramid pooling in deep convolutional networks for visual recognition

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2015)
  • T. Huang et al.

    A fast two-dimensional median filtering algorithm

    IEEE Trans. Acoust. Speech Signal Process.

    (1979)
  • A.G. Howard et al.

    MobileNets: efficient convolutional neural networks for mobile vision applications

    CoRR

    (2017)
  • G.E. Hinton et al.

    Replicated softmax: An undirected topic model

    International Conference on Neural Information Processing Systems

    (2009)
  • S. Ioffe et al.

    Batch normalization: Accelerating deep network training by reducing internal covariate shift

    ICML,...
  • L. Jonguk et al.

    Automatic recognition of aggressive behavior in pigs using a Kinect depth sensor

    Sensors

    (2016)
1 These authors contributed to the work equally.