Abstract
Big data analysis is the process of gathering, managing, and analyzing large volumes of data to uncover patterns and other valuable information. Agriculture is a significant potential application area: agricultural big data can combine data from internal systems with outside sources such as weather, soil, and crop data. Although big data analysis has driven advances in other industries, it has not yet been used extensively in agriculture. Several machine learning techniques have been developed to cluster data for crop yield prediction, but they suffer from low clustering accuracy and quality. To improve clustering accuracy with less complexity, a Proximity Likelihood Maximization Data Clustering (PLMDC) technique is developed for both sparse and densely distributed agricultural big data, with the aim of enhancing the accuracy of crop yield prediction for farmers. First, unnecessary data is cleansed from the sparse and dense agricultural data using a logical linear regression model. The proposed clustering method is then executed based on similarity and a weight-based Manhattan distance. A genetic algorithm (GA) with a suitable fitness function is applied to select features from the clustered data. Finally, a decision support system is built with the A-FP growth algorithm to predict crop yields from the selected features, such as weather and crop features. The proposed PLMDC technique achieves better clustering accuracy on both sparse and densely distributed data, with minimal time and space complexity. Based on the observed results, the PLMDC technique is more efficient than the existing methods.
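To make the weight-based Manhattan distance concrete, the following is a minimal sketch and not the authors' PLMDC implementation. The abstract does not state how the weights are chosen or how the distance is combined with the similarity measure, so the function names (`weighted_manhattan`, `assign_to_nearest`), the per-feature weight vector, and the nearest-centroid assignment rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_manhattan(x: Sequence[float], y: Sequence[float],
                       w: Sequence[float]) -> float:
    """Weighted Manhattan (L1) distance: sum_i w_i * |x_i - y_i|.

    The weights w are hypothetical per-feature importances; the abstract
    does not say how PLMDC derives them.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))


def assign_to_nearest(points: Sequence[Sequence[float]],
                      centroids: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[int]:
    """Assign each point to the index of its closest centroid.

    This nearest-centroid rule is an assumed stand-in for the clustering
    step, used here only to show the distance in context.
    """
    labels = []
    for p in points:
        distances = [weighted_manhattan(p, c, weights) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Example with two made-up features (e.g. rainfall and temperature),
# where the second feature is given half the weight of the first.
example_distance = weighted_manhattan([1.0, 2.0], [4.0, 6.0], [1.0, 0.5])
example_labels = assign_to_nearest(
    [[0.0, 0.0], [10.0, 10.0]],
    [[1.0, 1.0], [9.0, 9.0]],
    [1.0, 1.0],
)
```

Setting all weights to 1 reduces this to the ordinary Manhattan distance, so the weights are the only place where feature importance enters the comparison.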
Cite this article
Vani, P.S., Rathi, S. Improved data clustering methods and integrated A-FP algorithm for crop yield prediction. Distrib Parallel Databases 41, 117–131 (2023). https://doi.org/10.1007/s10619-021-07350-1
DOI: https://doi.org/10.1007/s10619-021-07350-1