A cluster-classification method for accurate mining of seasonal honey bee patterns
Introduction
Among all animal pollinators, insects alone were valued at €153 billion for their contribution to crop pollination worldwide in 2005, representing 9.5% of the total value of world agricultural production used for human food that year (Gallai et al., 2009). Bees are the most important group of pollinators (Brown et al., 2016), and the Western honey bee (Apis mellifera) is the species most commonly used for pollination around the world (Gallai et al., 2009; Potts et al., 2016). Approximately 75% of crops worldwide depend on insects for the production of fruits and/or seeds (Ollerton et al., 2011; Potts et al., 2010). Recent work has shown a decline in the number of pollinating species worldwide (Potts et al., 2016). In particular, honey bee populations have suffered mass deaths in some European regions and in North America due to Colony Collapse Disorder (CCD) and severe winters (Barron, 2015; Gil-Lebrero et al., 2017).
To detect abnormal states in honey bee colonies, such as a low adult bee population, a spotty brood pattern, or queen loss, it is usually necessary to open the beehives, remove the frames and check them in a routine called an inspection. Besides being invasive, even a careful inspection stresses the colonies (Braga et al., 2020), which can put pollination services and honey production at risk. Bees can also be crushed when the frames and boxes are moved. Moreover, many colonies are kept in remote rural apiaries, so inspections at such locations require long shifts. Remote monitoring of apiaries can therefore assist beekeepers by providing valuable information on the bees' behavior without an invasive inspection (Kridi et al., 2016; Meikle et al., 2017; Murphy et al., 2016; Sánchez et al., 2015; Zacepins et al., 2017; Zogovic et al., 2017), while sparing the bees unnecessary stress and other non-productive activities.
Today, thanks to the sensor network and Internet of Things paradigms, beekeepers and researchers can monitor bee colonies remotely (Kridi et al., 2016; Meikle and Holst, 2015; Zogovic et al., 2017). Remote monitoring via wireless sensors is one of the most important characteristics of precision beekeeping (Zacepins et al., 2015), which basically involves beehive data collection, data analysis and decision support in an apiary management context (Dineva and Atanasova, 2018). Once the sensors are installed in the hives, the apiary can be monitored without disturbance, even during periods when invasive inspections are contraindicated, such as the winter (Meikle et al., 2017). However, little is yet known about the semantics of the data collected from the hives (Jacobs et al., 2017; Zacepins et al., 2015), such as which physical variables most affect the bees' behavior. Such knowledge would help to improve, for instance, the well-being of bee colonies and their pollination results.
Here we propose a knowledge discovery method that uses clustering to extract seasonal honey bee patterns and then builds an accurate classification model from them. With this method, our goal is to answer the following central Research Question (RQ): “How can biologically relevant seasonal patterns of bee colonies be identified across different hives, even when the hives are from different apiaries?”
The main contribution of this paper is a method that combines clustering with data classification to detect and recognize seasonal honey bee patterns. To validate the proposed method, we used two periods that recur every year: the first corresponds to the spring and summer seasons (the active period of the bees), and the second corresponds to the autumn and winter seasons in the northern hemisphere, the “quiet” period of the colonies (Kviesis and Zacepins, 2016).
These patterns are composed of temperature, humidity, and weight sensor measurements. To accomplish this, we carried out the following activities, as shown in Fig. 1:
- i. obtained raw datasets of temperature, humidity, and weight of the bee colonies over a full one-year cycle;
- ii. removed anomalies and normalized the data;
- iii. split the colony datasets into subsets corresponding to the seasonal periods of a one-year cycle;
- iv. defined the optimal number of seasonal patterns for each period;
- v. collected the seasonal data patterns for each period;
- vi. had an expert recognize and interpret each data pattern;
- vii. labeled the dataset;
- viii. split the dataset into training, test, and validation subsets;
- ix. applied classification algorithms.
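Activities ii, iii, viii and ix lend themselves to a short illustration. The following Python sketch is not the authors' implementation: the record layout, the plausibility bounds, the 60/20/20 split and the choice of k-nearest neighbours as the classifier are all assumptions made for this example, and the labels in the demo merely stand in for the expert-interpreted cluster labels of activities vi and vii.

```python
import math
import random
from statistics import mean, pstdev

# Assumed layout of one record: (month, temperature_C, humidity_pct, weight_kg).
ACTIVE_MONTHS = {3, 4, 5, 6, 7, 8}  # assumption: March to August is the active period
# Hypothetical plausibility bounds per variable, used only for this illustration.
BOUNDS = [(-10.0, 50.0), (0.0, 100.0), (0.0, 150.0)]


def remove_anomalies(records):
    """Activity ii (part 1): drop records with any reading outside its bounds."""
    return [r for r in records
            if all(lo <= v <= hi for v, (lo, hi) in zip(r[1:], BOUNDS))]


def zscore_columns(rows):
    """Activity ii (part 2): rescale each column to zero mean and unit variance."""
    stats = [(mean(col), pstdev(col) or 1.0) for col in zip(*rows)]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]


def split_by_period(records):
    """Activity iii: separate active-period readings from quiet-period readings."""
    active = [r[1:] for r in records if r[0] in ACTIVE_MONTHS]
    quiet = [r[1:] for r in records if r[0] not in ACTIVE_MONTHS]
    return active, quiet


def three_way_split(samples, labels, seed=0, train=0.6, test=0.2):
    """Activity viii: shuffle once, then cut into training, test and validation sets."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    a = int(train * len(order))
    b = a + int(test * len(order))

    def take(ids):
        return [samples[i] for i in ids], [labels[i] for i in ids]

    return take(order[:a]), take(order[a:b]), take(order[b:])


def knn_predict(train_x, train_y, x, k=3):
    """Activity ix: majority vote among the k nearest training samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic readings, NOT data from the study.
    records = [(m, rng.gauss(34, 1), rng.gauss(60, 5), rng.gauss(40, 3))
               for m in range(1, 13) for _ in range(20)]
    active, quiet = split_by_period(remove_anomalies(records))
    scaled = zscore_columns(active)
    # Stand-in for activity vii: real labels would come from interpreted clusters.
    labels = [int(row[2] > 0) for row in scaled]
    (tr_x, tr_y), (te_x, te_y), _ = three_way_split(scaled, labels)
    hits = sum(knn_predict(tr_x, tr_y, x) == y for x, y in zip(te_x, te_y))
    print(f"test accuracy: {hits / len(te_x):.2f}")
```

Normalizing each period separately, as done in the demo, keeps one variable (for example weight in kilograms) from dominating the distance computations used later by clustering and by the nearest-neighbour vote.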
Material and methods
This section describes the methodological aspects of the research: the tool used, data collection and preprocessing, machine learning strategies, and the analysis and detection of the bee colony states under study.
Results
Fig. 2 shows the evolution of the SSD for k = 2, 5, 10, 15, 20 and 24 for the 1st period in the Arnas dataset. In this example, convergence occurred at iteration 9 for k = 2. The same convergence behavior of k-means is observed for the other periods. The CH index was then calculated to determine the most appropriate number of partitions for each period (step 1.3 of Algorithm 1).
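As a hedged illustration of this model-selection step, the sketch below runs a plain k-means, records the SSD after each iteration, and scores each candidate k with the CH index. It assumes that SSD denotes the within-cluster sum of squared distances and that CH is the Caliński-Harabasz index (the 1974 "dendrite method" reference). The function names, the toy points and the random initialization are illustrative and are not taken from Algorithm 1.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's k-means; returns centroids, labels and the SSD at each iteration."""
    centroids = random.Random(seed).sample(points, k)
    labels, ssd_history = None, []
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        ssd_history.append(sum(sq_dist(p, centroids[l])
                               for p, l in zip(points, new_labels)))
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = centroid(members)
    return centroids, labels, ssd_history


def calinski_harabasz(points, labels, k):
    """CH = (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k))."""
    n, overall = len(points), centroid(points)
    between = within = 0.0
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            continue
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))


def select_k(points, candidates, seed=0):
    """Return the candidate k with the highest CH score, plus all scores."""
    scores = {}
    for k in candidates:
        _, labels, _ = kmeans(points, k, seed=seed)
        scores[k] = calinski_harabasz(points, labels, k)
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy points, not hive data: two well-separated groups.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    _, _, history = kmeans(pts, 2)
    print("SSD per iteration for k = 2:", history)
    best, scores = select_k(pts, (2, 3, 4))
    print("CH scores:", scores, "selected k:", best)
```

The SSD sequence never increases from one iteration to the next, which is why its flattening can be read as convergence. A higher CH value indicates compact, well-separated clusters, so the candidate k with the largest score is retained.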
With the best prototypes of each k value defined, they were
Discussion
The obtained clusters were interpreted and validated by a beekeeping expert and against the knowledge available in articles on the subject. The expert also used plots of external temperature and humidity to support the interpretation of the clusters.
For the first period, from March to August 2017, the method returned 5 clusters for the beehive of the Arnas apiary. Their centroids are shown in Table 1, in the columns under the description “Arnas
Conclusion
Here we propose a method based on clustering and classification techniques for recognizing seasonal honey bee patterns that can be customized and integrated into a computer system for the remote monitoring of apiaries. This method takes into account three variables (internal temperature, internal humidity and hive weight) and can be customized to include other time-window patterns on a weekly/monthly basis to detect, for instance, swarming, timing for seasonal management, and the incidence of
Acknowledgements
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) - Finance Code 001. Danielo G. Gomes and Breno Freitas thank the CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for its financial support [grant numbers #302934/2010-3, #310317/2019-3, #432585/2016-8, #129426/2018-0]. Joseph A. Cazier thanks the Healthy Hives 2020 grant from Project Apis m [grant number #16-0181].
References (39)
Death of the bee hive: understanding the failure of an insect society. Curr. Opin. Insect Sci. (2015)
Social context influences the initiation and threshold of thermoregulatory behaviour in honeybees. Anim. Behav. (2013)
Economic valuation of the vulnerability of world agriculture confronted with pollinator decline. Ecol. Econ. (2009)
Application of wireless sensor networks for beehive monitoring and in-hive thermal patterns detection. Comput. Electron. Agric. (2016)
Global pollinator declines: trends, impacts and drivers. Trends Ecol. Evol. (2010)
Implementation of an electronic system to monitor the thermoregulatory capacity of honeybee colonies in hives with open-screened bottom boards. Comput. Electron. Agric. (2015)
Temperature changes above the upper hive body reveal the annual development periods of honey bee colonies. Comput. Electron. Agric. (2013)
Challenges in the development of precision beekeeping. Biosyst. Eng. (2015)
A dendrite method for cluster analysis. Commun. Stat. (1974)
A review of impacts of temperature and relative humidity on various activities of honeybees. Insect. Soc. (2017)
Applying the long-term memory algorithm to forecast thermoregulation capacity loss in honeybee colonies.
A method for mining combined data from in-hive sensors, weather and apiary inspections to forecast the health status of honey bee colonies. Comput. Elect. Agric.
Random forests. Mach. Learn.
A horizon scan of future threats and opportunities for pollinators and pollination. PeerJ
Nearest neighbor pattern classification. IEEE Trans. Inf. Theory
Osemn process for working over data acquired by iot devices mounted in beehives. Curr. Trends Nat. Sci.
Design and development of a smart weighing scale for beehive monitoring.
Honey bee colonies remote monitoring system. Sensors
Data Mining: Concepts and Techniques.