Autonomous development of theoretical framework for intelligence automation system using decision tree algorithm
Introduction
Big data classification is widely applied in pattern recognition, fault diagnosis, information retrieval, and target recognition. The basic principle of data classification is to extract useful features from a large stream of data, build a mathematical model, and classify the data with a clustering algorithm [1]. Common methods include fuzzy C-means, K-means, bundle gradient descent, particle swarm optimization, and support vector machines. These algorithms typically reduce big data classification to a convex quadratic programming problem; when the data contain heavy redundancy, they tend to fall into local optima because the finite convergence of the algorithms is not stable [[2],[3]]. Aiming at this poor-convergence problem, this paper proposes an optimization method for big data classification based on probability statistics. First, a two-Poisson classification model for big data is constructed in an infinite-dimensional vector space, with the probability depending on density estimation. Then, based on the classification objective function, confidence intervals are constructed in the geometric neighborhood of the data clustering centers using two-Poisson differential equations; the confidence features for big data classification are extracted, and a stable solution is obtained through the eigenvalues of the clustering centers. Finally, accurate statistical classification of big data is realized in the Bernoulli space [4].
With the continuous development of the Internet, people have entered an era of information explosion in which data exist in massive volumes, and finding the data a user actually needs has become an important concern [[4],[5]]. Big data classification helps users find the data they need and therefore has high application value. Traditional classification methods, such as fuzzy C-means and density- or grid-based clustering algorithms, use the distribution of sequence data to formulate a data classification model and analyze the probability functional of the data distribution. Building on an analysis of the diversity of data groups, scholars put forward the Naive Bayesian data classification model, which is combined with high-order statistical feature modeling to realize data classification [6]. Although this model has the advantage of good convergence, its boundedness in a finite-dimensional Morrey-Herz convex space is poor, and when a regional approximation optimization model is used to classify the data, convergence is again unsatisfactory. The classification method adopted here is based on a mathematical model of the data, and the results show that the model achieves high accuracy, a low error rate, and good convergence [7].
The rest of the article is organized as follows. Section 2 presents the mathematical model; Section 3 describes the functionality of the decision tree algorithm; Section 4 depicts the framework of the digital archive platform; Section 5 presents the overall proposed framework; Section 6 covers the performance measures; Section 7 presents the results and discussion; and the final section concludes the paper.
Section snippets
A probabilistic mathematical model based on large data classification
Data mining primarily refers to the extraction of potential information and knowledge, of which people have no idea in advance, from incomplete, noisy, fuzzy, and random data; it is otherwise called the data-exploiting process. In a broader sense, data and information are manifestations of knowledge, but data mining is more concerned with the rules, laws, and constraints behind them. Data mining is a cross-disciplinary field, involving computer science, mathematical statistics,
Decision tree based algorithm
A decision tree is a decision-support tool whose structure resembles a binary tree or a multi-way tree. Each non-leaf node corresponds to a test on a non-category attribute of the training sample set, and the path from the root node to a leaf node expresses one classification rule [7]. Different rules together give the decision tree its powerful classification capability. A decision tree is constructed from top to
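The attribute test at each non-leaf node described above is usually chosen by an information-theoretic criterion. The following is a minimal ID3-style sketch in pure Python that selects a root attribute by information gain; the toy dataset and attribute names are illustrative, not taken from the paper.

```python
# Minimal ID3-style sketch: choose the root attribute of a decision
# tree by information gain. Dataset and attributes are illustrative.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Entropy reduction from splitting on attribute index `attr`."""
    split = {}
    for row, y in zip(rows, labels):
        split.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in split.values())
    return entropy(labels) - remainder

# Toy training sample set: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"),
        ("rain", "yes"), ("overcast", "no"), ("overcast", "yes")]
labels = ["no", "no", "yes", "no", "yes", "yes"]

# The attribute with the highest gain becomes the root test.
best = max(range(2), key=lambda a: info_gain(rows, labels, a))
print("root attribute index:", best)  # prints "root attribute index: 0"
```

Each branch of the chosen test then receives the matching subset of samples, and the procedure recurses until the leaves are pure, yielding one rule per root-to-leaf path.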
Digital archives management platform using decision tree-based classification algorithm
With the help of decision tree based classification technology and its principles, the timely classification of information makes it easier to access the corresponding data or information in a relatively short time when people query for it [[11],[13]].
First, the problem refers to large-scale data mining, which extracts information from big data. Second, in most data mining work, the most time and effort goes into deciding the features. The success or failure of many data analysis
Overall design framework
The overall framework of the data acquisition system is composed of a control center, a data storage layer, a data processing layer, a data acquisition layer, and a network access layer. In the system design, an AT89C52 is used as the main processing chip, combined with Realtek's 10 Mbit Ethernet control chip RTL8019AS so that the AT89C52 drives the RTL8019AS. This achieves the interconnection of the greenhouse data mining system with external
Performance measure
To assess the system's performance, the following performance measures are employed:
Mean absolute error
The mean absolute error (MAE) measures the errors between paired observations describing the same phenomenon and is a standard statistic for evaluating a regression model:

MAE = (1/n) Σ_{i=1}^{n} |y_i − x_i|

where:
- y_i represents the predicted value,
- x_i represents the true value of the system,
- n represents the number of data points.
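The MAE computation described above can be sketched in a few lines of Python; the sample values are illustrative only.

```python
# Minimal sketch of the mean absolute error (MAE).
# y holds predicted values, x holds true values (names follow the text).
def mean_absolute_error(y, x):
    if len(y) != len(x):
        raise ValueError("prediction and truth must have equal length")
    return sum(abs(yi - xi) for yi, xi in zip(y, x)) / len(y)

print(mean_absolute_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))  # 0.333...
```

Because every error contributes linearly, MAE is less sensitive to occasional large errors than the squared-error measures.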
Results and discussion
The adequacy of the proposed homomorphic-encryption-based deep learning models for defect localization and detection is presented in this part. A Deep Neural Network (DNN) is the AI model used for training on both plain and encrypted data. In this paradigm there are three types of layers, namely input, output, and hidden. Due to the intrinsic non-linear features of the layers, the DNN structure recognizes and summarizes the fundamental patterns and facts in
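The input/hidden/output layering with non-linear activations mentioned above can be sketched as a bare forward pass; the layer sizes, random weights, and input vector are illustrative assumptions, not the paper's trained model.

```python
# Minimal sketch of a DNN forward pass: input -> hidden (ReLU) -> output.
# Layer sizes and weights are illustrative, not from the paper.
import random

random.seed(0)

def relu(v):
    """Element-wise non-linearity applied to the hidden layer."""
    return [max(0.0, x) for x in v]

def dense(v, weights, bias):
    """One fully connected layer: out_j = sum_i v_i * w_ji + b_j."""
    return [sum(vi * w for vi, w in zip(v, row)) + b
            for row, b in zip(weights, bias)]

# input(3) -> hidden(4, ReLU) -> output(2)
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b1 = [0.0] * 4
w2 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
b2 = [0.0] * 2

x = [0.5, -1.0, 2.0]
hidden = relu(dense(x, w1, b1))
output = dense(hidden, w2, b2)
print(len(hidden), len(output))  # prints "4 2"
```

Without the ReLU between the two dense layers, the network would collapse to a single linear map; the non-linearity is what lets it summarize non-linear patterns.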
Conclusion
In order to improve data mining and analysis capabilities, this paper has proposed a mathematical model for probability-based classification of big data, applicable to data classification, pattern recognition, feature extraction, fault diagnosis, and target recognition. The proposed method significantly improves data classification accuracy; its average classification error rate is low, and it also has other advantages such as
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rajashree S is presently working as Assistant Professor in Sathyabama Institute of Science and Technology. Her research interests include data mining algorithms and Machine learning algorithms.
References (24)
- et al., Probabilistic performance assessment of a coal-fired power plant, Appl Energy (2015)
- et al., Characterization of uncertainty in probabilistic model using bootstrap method and its application to reliability of piles, Appl Math Model (2015)
- Generic probabilistic prototype based classification of vectorial and proximity data, Neurocomputing (2015)
- et al., Deep learning-based image recognition for autonomous driving, IATSS Res (2019)
- et al., Probabilistic model of waiting times between large failures in sheared media, Phys Rev E (2016)
- et al., Using ELM-based weighted probabilistic model in the classification of synchronous EEG BCI, Med Biol Eng Comput (2017)
- et al., Probabilistic programming in Python using PyMC, Statistics (2015)
- et al., Predictive modeling of implantation outcome in an in vitro fertilization setting: an application of machine learning methods, Med Decis Mak (2015)
- et al., A probabilistic method for determining cortical dynamics during seizures, J Comput Neurosci (2015)
- et al., Probabilistic tsunami damage assessment considering stochastic source models: application to the 2011 Tohoku earthquake, Coast Eng J (2015)
- Application of probabilistic approach to evaluate coalbed methane resources using geological data of coal basin in Indonesia, Geosci J
- Efficient model selection for probabilistic K nearest neighbour classification, Neurocomputing