Original papers
C5.0: Advanced Decision Tree (ADT) classification model for agricultural data analysis on cloud
Introduction
Agriculture is the backbone of India’s economy and a major source of employment. Crop cultivation has nevertheless declined because many farmers still follow traditional farming methods. Farmers, governments, agricultural scientists and researchers are therefore exploring new methods to get higher yields from farmland. The Government of India is supporting the agriculture industry through several activities aimed at increasing crop productivity (Vincya and Valarmathi, 2016). Building on the Green, White, Yellow, and Blue revolutions, new ideas can be introduced for particular crops. Their success depends on natural factors such as water, soil and climate change. Crop production depends mainly on soil fertility, so it is essential to identify the nutrients that are in short supply and replenish them to obtain high-quality crops (Soil test, 2016).
In India, agricultural soil is very deficient in the primary nutrients (Nitrogen, Phosphorus, and Potassium). It is therefore necessary to overcome the deficiency by applying the correct amount of fertilizer for the crop. Soil health is the combination of the physical, chemical and biological activities of soil. The essential goal of any agricultural technology is to improve crop production with immediate and sustainable benefits (Athmaja and Hanumanthappa, 2016). Farmers harvest not only crops and vegetables but also massive amounts of data, so big data analytics is needed to analyze it. Big data analytics works with datasets so large that processing them is problematic for database management applications or traditional data processing tools.
MapReduce is a programming model, with an associated implementation, for processing big data sets with a parallel, distributed algorithm. It performs two operations, Map() and Reduce(): Map() filters and sorts the data, and Reduce() summarizes it (Bhargavi and Jyothi, 2011). Data mining discovers previously unknown information in a dataset. Determining the soil fertility level requires information on soil nutrients, soil composition and soil characteristics. The measured fertility, obtained from the soil test, also indicates the deficiencies that need treatment. In agricultural production, soil fertility is an important factor for crop growth and productivity.
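The Map() and Reduce() roles described above can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop job: the records, the village labels and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical soil test records: (village, nitrogen reading).
# The values are illustrative only and do not come from the paper.
SAMPLES = [
    ("A", 120.0),
    ("B", 200.0),
    ("A", 140.0),
    ("B", None),  # missing reading, dropped in the map step
    ("C", 90.0),
]


def map_phase(records):
    """Map: filter out unusable records and emit (key, value) pairs."""
    for village, nitrogen in records:
        if nitrogen is not None:
            yield village, nitrogen


def shuffle(pairs):
    """Shuffle/sort: group the emitted values by key, keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: summarize each group, here as the mean reading per village."""
    return {key: mean(values) for key, values in groups.items()}


def run_job(records):
    """Chain the three steps the way a MapReduce job would."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(run_job(SAMPLES))
```

The filtering happens entirely in the map step and the summary entirely in the reduce step, which is what lets a framework distribute the two phases independently.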
Most farmers are unaware of soil testing and lack pertinent knowledge about the soil on their land (Chandrakar et al., 2011). Soil test reports enable farmers to choose the type of fertilizer and to apply it according to the soil's requirements. Excessive application of fertilizer is one of the major problems in the agricultural domain, and it should be corrected by analyzing the soil fertility level (Han and Kamber, 2012). In addition, it is important to define the timing, type and quantity of fertilizer application for healthy crop growth. This work therefore covers the steps for building an efficient and accurate predictive model for soil fertility with the help of our proposed C5.0: ADT Classifier algorithm. The motivation of this work is to assist farmers in Virudhunagar District, via a mobile app, in making smart decisions to optimize crop yield based on soil nutrient availability and fertilization.
The rest of the paper is organized as follows: Section 2 reviews different decision tree algorithms applied to agricultural data; Section 3 presents the workflow of the proposed method; Section 4 reports the experimental results of the proposed method; and Section 5 concludes the paper.
Section snippets
Related work
Simulation models play a significant role in the development of agro-ecological and socio-economic conditions. Several applications based on data mining algorithms are in use in the agriculture field. Decision Support Systems (DSS) are used to generate data for agriculture management systems, viz., pest management, farm management and crop management systems. The performance of these systems is low. Hence, the utilization of IoT based advanced techniques could improve the
System node and architecture
In this work, a new model is used to predict the soil fertility level. In general, a soil test report contains information about the macronutrient and micronutrient contents of the given soil. From the soil test reports, it is observed that Virudhunagar District soil contains the maximum amount of the macronutrients, viz., Nitrogen, Phosphorus and Potassium, and the amount of fertilizer required for crop growth is calculated accordingly.
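This excerpt does not spell out how the proposed classifier chooses its splits. As a rough illustration only, decision trees in the C4.5/C5.0 family are commonly described as ranking candidate attributes by information gain ratio; the sketch below assumes that criterion. The attribute names, category labels and rows are hypothetical and are not taken from the authors' soil test data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    labels = [row[target] for row in rows]
    # Partition the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes that fragment the data into many values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, attributes, target):
    """Pick the attribute with the highest gain ratio as the next split."""
    return max(attributes, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical, hand-made rows: discretized nitrogen and pH against a fertility label.
ROWS = [
    {"nitrogen": "low", "ph": "acidic", "fertility": "low"},
    {"nitrogen": "low", "ph": "neutral", "fertility": "low"},
    {"nitrogen": "high", "ph": "acidic", "fertility": "high"},
    {"nitrogen": "high", "ph": "neutral", "fertility": "high"},
]

if __name__ == "__main__":
    print(best_attribute(ROWS, ["ph", "nitrogen"], "fertility"))
```

In this toy data the nitrogen attribute separates the fertility labels perfectly while pH carries no information, so nitrogen is selected as the split.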
The proposed method workflow is shown in
Implementation
R is an open-source statistical software environment, mostly used to process statistical data. On its own it does not handle large amounts of data, so the RHadoop, RmR and Rhdfs packages are used to integrate R with the Hadoop environment. This combination works with terabytes of data and handles it easily. The code for the proposed method was written with the help of the RHadoop and RmR packages. The flow of the proposed system is shown in Fig. 2.
Conclusion
The proposed model helps farmers analyze the soil nutrient information of their farmland. This is an essential requirement for the agriculture sector: increasing crop production with less fertilizer while keeping the soil nutrients in good health. Through soil tests, farmers learn the present fertilizer requirements for a particular crop. These proposed models also facilitate the estimation of the fertility level and suggest the suitable crop
Conflict of interest
The authors have no conflict of interest to declare.
Acknowledgments
The authors are grateful to the Management of Kalasalingam Academy of Research and Education for providing a fellowship, and thank the National Cyber Defence Research Centre for providing laboratory facilities during this research work.
References (21)
- et al. (2018). Feature selection and overlapping clustering-based multilabel classification model. Hindawi Math. Probl. Eng.
- et al. (2016). Applications of mobile cloud computing and big data analytics in agriculture sector – a survey. Int. J. Adv. Res. Comput. Commun. Eng.
- et al. (2011). Soil classification using data mining techniques: a comparative study. Int. J. Eng. Trends Technol.
- et al. (2011). Applying classification techniques in Data Mining in agricultural land soil. Int. J. Comput. Eng.
- et al. (2013). Use of mobile multimedia agricultural advisory systems by Indian farmers: results of a survey. J. Agric. Extens. Rural Devel.
- et al. (2012). Soil data analysis using classification techniques and soil attribute prediction. Int. J. Comput. Sci. Issues
- et al. (2014). Machine learning for soil fertility and plant nutrient management using back propagation neural networks. Int. J. Recent Innovat. Trends Comput. Commun.
- et al. (2012). Data Mining: Concepts and Techniques.
- et al. (2016). Classification of soil type in Salem district using J48 Algorithm. Int. J. Comput. Technol. Appl.
- Virudhunagar District Profile:...