Original papers
A supervised machine learning method to detect anomalous real-time broiler breeder body weight data recorded by a precision feeding system
Introduction
Computer technology has proved beneficial to animal agriculture. Hardware and software can be used to monitor animal performance automatically (Banhazi et al., 2012, Berckmans, 2014), making research and production less labor-intensive while collecting big data that help to interpret and improve animal performance. A current example is a precision feeding (PF) system for poultry, which was developed at the University of Alberta (Zuidhof et al., 2017, Zuidhof et al., 2019). It is a sequential feeding system that aims to increase body weight (BW) uniformity in a flock by allocating the right amount of feed to each bird individually over several small meals each day. Birds are individually weighed in the PF station, and the system then decides whether or not to feed the bird by comparing its real-time BW to the target BW. Birds frequently visit the PF station to gain access to feed, and BW data are recorded upon each visit (Fig. 1). Visit frequency of breeder pullets from 2 to 22 weeks of age varied from 28 to 138 visits per day (Zuidhof, 2018). These data are likely to be contaminated by occasional anomalous observations, which can be caused by multiple birds entering the station at the same time, upward or downward variation in the scale reading due to movement of the bird, or a misread of the radio frequency identification (RFID) tag. These anomalous observations can cause incorrect estimates of daily BW and daily BW gain. Statistical methods and unsupervised learning methods may be used to detect the anomalies in real-time BW. These methods are effective only to some extent, because they focus solely on the data distribution and cannot distinguish reasonable variations in BW caused by the feeding activity of birds from unreasonable variations that feeding activity cannot explain.
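For reference, a distribution-only check of the kind described above can be sketched as follows. This is a minimal illustration and not the method evaluated in the study: the function name `iqr_outliers`, the multiplier of 1.5, and the sample weights are all assumptions made for the example. The rule sees only where a value sits within the day's distribution, so it has no information with which to separate a weight change after a meal from a reading taken with two birds on the scale.

```python
import numpy as np

def iqr_outliers(bw, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    bw = np.asarray(bw, dtype=float)
    q1, q3 = np.percentile(bw, [25, 75])
    iqr = q3 - q1
    return (bw < q1 - k * iqr) | (bw > q3 + k * iqr)

# Hypothetical BW readings (g) for one bird on one day.
readings = [1502, 1498, 1510, 1505, 2990, 1515, 1120, 1520]
flags = iqr_outliers(readings)
print(flags)
```

With these made-up readings the rule flags the two extreme values, but it would treat a legitimate post-meal jump of the same size in exactly the same way.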
Removing the anomalies in the data manually is accurate because people can judge anomalous observations by considering both the data distribution and the features of the feeding activity of individual birds recorded by the PF system; however, it is time-consuming and labor-intensive. In the current study, a supervised machine learning method was used to detect anomalies in real-time BW of individual birds recorded by the PF system, based on manually labeled data. Variables describing both the statistical distribution of the data and the feeding activity of individual birds were extracted from a dataset recorded by a PF system. Based on the labeled data, various machine learning algorithms were applied, and the algorithm with the highest F1 score and area under the precision-recall curve (AUCPR) was selected for comparison with 4 other common anomaly detection methods.
Method
Fig. 2 illustrates the key steps for developing the machine learning method to detect anomalies. In the current study, Python 3.7.0 was used for all data analysis work, including data preprocessing, feature engineering, algorithm selection, and comparison with other common anomaly detection methods. The Scikit-learn library 0.21.0 (Pedregosa et al., 2011) and the deep learning framework Keras (Kumar and Manjula, 2019) were used to implement the machine learning algorithms.
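The algorithm selection step can be sketched with scikit-learn along the following lines. This is an illustration under stated assumptions, not the authors' code: the data are synthetic (`make_classification` with an assumed anomaly rate of about 10%), default hyper-parameters are used instead of optimized ones, only three candidate algorithms are included, and scikit-learn's `average_precision` scorer stands in as an estimate of AUCPR.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the labeled data (assumption): 12 features,
# roughly 10% of observations in the positive (anomalous) class.
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# Distance- and margin-based models are scaled; the forest does not need it.
candidates = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {}
for name, model in candidates.items():
    result = cross_validate(model, X, y, cv=cv,
                            scoring=["f1", "average_precision"])
    scores[name] = {
        "f1": result["test_f1"].mean(),
        "aucpr": result["test_average_precision"].mean(),
    }

# Rank by F1 first and use the AUCPR estimate to break ties.
best = max(scores, key=lambda n: (scores[n]["f1"], scores[n]["aucpr"]))
print(best, scores[best])
```

Stratified folds keep the anomaly rate similar across splits, which matters when the positive class is rare.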
Results
Table 3 shows the evaluation of 4 machine learning algorithms with optimized hyper-parameters. KNN had the highest precision (0.9746) and SVM had the highest recall (0.9917); however, RF had the highest F1 score (0.9712), which is the harmonic mean of precision and recall. In addition, Fig. 4 shows that the AUCPR of RF (0.9948) was higher than that of all other algorithms, indicating that RF was a more effective model for this imbalanced binary classification problem. Thus, RF was selected as the best algorithm
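A short worked example may help show why the algorithm with the highest precision or the highest recall need not have the highest F1 score. The counts below are hypothetical and are not taken from Table 3; the helper name `precision_recall_f1` is likewise an assumption for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of flagged points that are true anomalies
    recall = tp / (tp + fn)      # share of true anomalies that were flagged
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(round(p, 4), round(r, 4), round(f1, 4))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model needs both precision and recall to be high to score well on F1.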
Discussion
The PF system recorded real-time broiler breeder BW in two dimensions: real-time BW and time. The recorded data had two characteristics: they were regularly shaped over a long period of time and irregularly scattered within one day (Fig. 1). Since the PF system fed each individual bird following a target BW curve that was sigmoidal in shape, temporally sequenced real-time BW data of an individual bird throughout the trial (from day 15 to day 306) can be described by a triphasic Gompertz
Conclusions
The current study was the first to propose a supervised machine learning method to detect anomalies in real-time BW data of broiler breeders collected by a PF system. Real-time BW data of 5 randomly selected broiler breeders were used in the current study. To detect the anomalous observations over the period of trial (from day 15 to day 306), 12 variables considering statistical distribution of data and features regarding the feeding activity recorded by the PF system for each day were created
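The idea of combining per-day distribution statistics with feeding-activity information can be sketched with pandas as below. The 12 variables used in the study are not listed in this excerpt, so every column name here (`bird_id`, `timestamp`, `bw`, `feed_intake`) and every derived feature is a hypothetical stand-in chosen for illustration.

```python
import pandas as pd

def add_daily_features(visits: pd.DataFrame) -> pd.DataFrame:
    """Attach per-bird, per-day features to each visit record."""
    out = visits.sort_values(["bird_id", "timestamp"]).copy()
    out["day"] = out["timestamp"].dt.floor("D")
    grp = out.groupby(["bird_id", "day"])
    # Distribution-based features for the day.
    out["daily_median_bw"] = grp["bw"].transform("median")
    out["dev_from_median"] = out["bw"] - out["daily_median_bw"]
    out["daily_std_bw"] = grp["bw"].transform("std")
    # Feeding-activity features for the day.
    out["visits_in_day"] = grp["bw"].transform("count")
    out["cum_feed_intake"] = grp["feed_intake"].cumsum()
    out["bw_change"] = grp["bw"].diff()
    return out

# Hypothetical visit log for one bird (BW and feed intake in g).
visits = pd.DataFrame({
    "bird_id": [1, 1, 1, 1],
    "timestamp": pd.to_datetime([
        "2020-01-01 08:00", "2020-01-01 12:00",
        "2020-01-01 18:00", "2020-01-02 08:00",
    ]),
    "bw": [1500.0, 1510.0, 2990.0, 1530.0],
    "feed_intake": [10.0, 12.0, 0.0, 11.0],
})
features = add_daily_features(visits)
print(features[["bw", "dev_from_median", "cum_feed_intake", "bw_change"]])
```

In this toy log the third visit shows a jump of 1480 g with no feed delivered, the kind of pattern a classifier could learn to flag when both groups of features are available.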
CRediT authorship contribution statement
Jihao You: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - original draft. Edmond Lou: Conceptualization, Resources, Supervision, Writing - review & editing. Mohammad Afrouziyeh: Data curation, Writing - review & editing. Nicole Zukiwsky: Data curation, Writing - review & editing. Martin J. Zuidhof: Conceptualization, Project administration, Funding acquisition, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The data for this research originated from a project financed by Alberta Agriculture and Forestry (Edmonton, Alberta). The authors would like to acknowledge the students and staff of the Poultry Research Center at the University of Alberta for technical support. The authors would also like to acknowledge the technical support from the AI-Supercomputing Hub at the University of Alberta.
References (32)
- et al. Alternative k-nearest neighbour rules in supervised pattern recognition: Part 1. k-Nearest neighbour classification by using alternative voting rules. Anal. Chim. Acta (1982)
- et al. The precision–recall curve overcame the optimism of the receiver operating characteristic curve in rare diseases. J. Clin. Epidemiol. (2015)
- et al. Classification of sentiment reviews using n-gram machine learning approach. Expert Syst. Appl. (2016)
- Lifetime productivity of conventionally and precision-fed broiler breeders. Poult. Sci. (2018)
- et al. Precision feeding: Innovative management of broiler breeder feed intake and flock uniformity. Poult. Sci. (2017)
- et al. Precision livestock farming: an international review of scientific and commercial aspects. Int. J. Agric. Biol. Eng. (2012)
- et al. Features and performance of some outlier detection methods. Journal of Applied Statistics (2011)
- Behera, S., Rani, R., 2016. Comparative analysis of density based outlier detection techniques on breast cancer data...
- Precision livestock farming technologies for welfare management in intensive livestock systems. Rev. Sci. Tech. (2014)
- et al. Area under the precision-recall curve: point estimates and confidence intervals. Joint European conference on machine learning and knowledge discovery in databases (2013)
- Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation. Behavior Research Methods
- Skewness and kurtosis in function of selection of network traffic distribution. Acta Polytechnica Hungarica
- Support-vector networks. Machine learning