Autonomous development of theoretical framework for intelligence automation system using decision tree algorithm

https://doi.org/10.1016/j.compeleceng.2022.108131

Abstract

A novel technique is proposed to classify big data on the basis of probability, built on a mathematical model. Adopting a two-fold growth analysis model, nonlinear differential equations are formulated from the data statistics and the characterization of stable and convex functions. This yields an optimal estimation technique for constructing parameters such as the test statistics, the probability density function of the data distribution, and the clustering. The research also obtains the data classification by a sigma test, examines the probability criterion, rejects the interval, and demonstrates the stability and iterative convergence of the mathematical model. Simulation and data analysis show that the proposed method achieves high model accuracy, a low average error rate, and acceptable convergence. Optimal prediction on big data is performed through the control input sequence and the optimal objective function, using chaos variables and nonlinear random traversal of the classification centers. The method also establishes the topological relationship between data points and improves the classification algorithm. Simulation results show that the proposed algorithm can effectively improve the accuracy of big data classification and reduce the misclassification rate.

Introduction

Big data classification is widely applied in the fields of pattern recognition, fault diagnosis, information retrieval, and target recognition. The basic principle of data classification is to extract useful features from a large flow of data, build a mathematical model, and then achieve large data classification through a data clustering algorithm [1]. Common methods include fuzzy C-means, the K-means algorithm, the gradient descent bundle method, the particle swarm algorithm, and the support vector machine. These algorithms typically reduce the large data classification problem to convex quadratic programming problems. When processing highly redundant large data, they easily fall into local optima because the stability and finite convergence of the algorithms are poor [[2],[3]]. Aimed at the problem of poor convergence in big data classification, this paper proposes an optimization method for large data classification based on probability statistics. First, a two-Poisson classification model for large data is constructed in an infinite-dimensional vector space, where the probability depends on density estimation. Then, based on the classification objective function, confidence intervals are constructed in the geometric neighborhood of the data clustering centers using two-Poisson differential equations. The method finds the confidence features for large data classification and then obtains a stable solution through the eigenvalues of the clustering centers. In the Bernoulli space, accurate classification of large data statistics is thereby realized [4].
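As a concrete illustration of this style of probability-based classification, the sketch below (Python) summarizes each class by a clustering centre and a Gaussian density estimate, and assigns a new point to the class whose estimated density is highest. This is a minimal stand-in for the two-Poisson model described above, not the paper's algorithm; the function names and toy data are assumptions for illustration only.

import numpy as np

def fit_class_models(X, y):
    """Estimate a clustering centre (mean) and covariance per class."""
    models = {}
    for label in np.unique(y):
        Xc = X[y == label]
        centre = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularize
        models[label] = (centre, cov)
    return models

def log_density(x, centre, cov):
    """Log of a multivariate Gaussian density, used as the class score."""
    d = x - centre
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ inv @ d + logdet + len(x) * np.log(2 * np.pi))

def classify(x, models):
    """Assign x to the class with the largest estimated density."""
    return max(models, key=lambda label: log_density(x, *models[label]))

# Toy usage: two noisy clusters standing in for "large data" classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
models = fit_class_models(X, y)
print(classify(np.array([3.5, 4.2]), models))  # expected: 1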

With the continuous development of the internet, people have entered an era of information explosion in which data exist in massive form, and how to easily find the data a user needs has become an important issue of concern [[4],[5]]. Big data classification can help users find the data they need and has high application value. Traditional classification methods include fuzzy C-means data clustering algorithms and grid-density-based data classification algorithms, which use the distribution of sequence data to formulate the data classification model and the probability functional analysis of the data distribution. Scholars have also put forward the Naive Bayesian data classification model, based on analysis of the diversity characteristics of data groups and combined with high-order statistical feature modeling to realize data classification [6]. Although this model has the advantage of good convergence, its boundedness in a finite-dimensional Morrey-Herz convex space is not good, and when a regional approximation optimization model is used to classify the data, the convergence is also poor. The classification method proposed here is based on a mathematical model of the data; the results show that the model has high accuracy, a low error rate, and good convergence [7].
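For reference, the following minimal sketch shows the Naive Bayesian classification idea mentioned above using scikit-learn's GaussianNB rather than the model discussed in [6]; the two synthetic data groups are placeholders, not data from the paper.

import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
# Two synthetic "data groups" with different statistical characteristics.
X = np.vstack([rng.normal(-2, 1, (100, 3)), rng.normal(2, 1, (100, 3))])
y = np.array([0] * 100 + [1] * 100)

clf = GaussianNB()  # assumes conditionally independent Gaussian features
clf.fit(X, y)
print(clf.predict([[1.8, 2.2, 1.9]]))         # -> [1]
print(clf.predict_proba([[0.0, 0.1, -0.2]]))  # class membership probabilities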

The article is structured as follows. Section 2 presents the mathematical model; Section 3 describes the decision tree algorithm and its functionality; Section 4 depicts the framework of the digital archive platform; Section 5 presents the overall proposed framework; Section 6 describes the performance measures; Section 7 presents the results and discussion; and the final section concludes the paper.

Section snippets

A probabilistic mathematical model for large data classification

Data mining primarily refers to the extraction of potential information and knowledge, not known in advance, from incomplete, noisy, fuzzy, and random data; it is also called the data exploiting process. To a broader extent, data and information are manifestations of knowledge, but data mining is more concerned with the rules, laws, and constraints around them. Data mining is a cross-disciplinary field involving computer science, mathematical statistics,

Decision tree based algorithm

A decision tree is a decision support tool whose structure is similar to a binary tree or a multi-way tree. Each non-leaf node corresponds to a test on a non-category attribute in the training sample set, and each path from the root node to a leaf node represents one classification rule, as described in [7]. Different rules give rise to the powerful classification ability of the decision tree. A decision tree is constructed from top to

Digital archives management platform using decision tree-based classification algorithm

With the help of decision-tree-based classification technology and its principles, timely classification of information makes it easier to access the corresponding data or information in a relatively short time when people query for it [[11],[13]].
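To make this concrete, the sketch below trains a small decision tree on hypothetical archive records; the features (file size, age, access count) and the two categories are illustrative assumptions rather than the platform's actual schema, and each root-to-leaf path printed by export_text corresponds to one classification rule.

from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: [file_size_mb, age_days, access_count]
X = [
    [0.5, 30, 120], [1.2, 45, 90],     # frequently used administrative records
    [250.0, 900, 3], [180.0, 1200, 1]  # large, rarely accessed historical records
]
y = ["administrative", "administrative", "historical", "historical"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Each printed root-to-leaf path corresponds to one classification rule.
print(export_text(tree, feature_names=["size_mb", "age_days", "access_count"]))
print(tree.predict([[2.0, 60, 75]]))  # -> ['administrative']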

First, the problem refers to large data mining, that is, obtaining information from large data. Secondly, in most data mining, the most time and effort go into deciding the features. The success or failure of many data analysis

Overall design framework

The overall framework of the data acquisition system is composed of a control center, a data storage layer, a data processing layer, a data acquisition layer, and a network access layer. The system design uses the AT89C52 as the main processing chip and combines it with Realtek's 10 Mbps Ethernet controller chip RTL8019AS, so that the AT89C52 drives the RTL8019AS. It further aims to achieve the interconnection of the greenhouse data mining system and external

Performance measure

To assess the system's performance, the following performance measures are employed:

Mean absolute error

The mean absolute error between paired observations describing the same phenomenon is a measure of error; the mean absolute error (MAE) is a statistic for evaluating a regression model (a short computation sketch follows the definitions below):

$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - x_i \rvert$

Where,

  • $y_i$ represents the predicted value

  • $x_i$ represents the true value

  • $n$ represents the number of data points
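The sketch below computes this definition with numpy; the prediction and true-value arrays are placeholders.

import numpy as np

def mean_absolute_error(y_pred, x_true):
    """MAE = (1/n) * sum(|y_i - x_i|)."""
    y_pred, x_true = np.asarray(y_pred), np.asarray(x_true)
    return np.mean(np.abs(y_pred - x_true))

print(mean_absolute_error([2.5, 0.0, 2.1], [3.0, -0.5, 2.0]))  # 0.3666...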

Results and discussion

The adequacy of the proposed homomorphic-encryption-based deep learning models for defect localization and detection is presented in this section. The Deep Neural Network (DNN) is the machine learning model used for training on both plain and encrypted data. In this paradigm, there are three types of layers, namely input, output, and hidden. Due to the layers' intrinsic non-linear features, the DNN structure recognizes and generalizes the underlying patterns and facts in
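As a rough structural sketch of such a network (input, hidden, and output layers with non-linear activations), the numpy-only code below performs a single forward pass; the layer sizes and random weights are placeholders, no training or encryption step is shown, and it is not the model evaluated in the paper.

import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Layer sizes: 8 input features -> 16 hidden units -> 2 output classes.
W1, b1 = rng.normal(0, 0.1, (8, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.1, (16, 2)), np.zeros(2)

def forward(x):
    """Forward pass: the hidden layer's non-linearity lets the network
    capture non-linear patterns in the input."""
    hidden = relu(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)

x = rng.normal(size=(1, 8))  # one placeholder input sample
print(forward(x))            # class probabilities summing to 1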

Conclusion

In order to improve data mining and analysis capabilities, this paper has proposed a mathematical model for a probability-based classification method for large data, applicable to data classification, pattern recognition, feature extraction, fault diagnosis, and target recognition. The proposed method significantly improves data classification accuracy. The average classification error rate is low, and it also has other advantages such as

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rajashree S is presently working as an Assistant Professor at Sathyabama Institute of Science and Technology. Her research interests include data mining algorithms and machine learning algorithms.

References (24)

  • Y.M. Kim et al., Application of probabilistic approach to evaluate coalbed methane resources using geological data of coal basin in Indonesia, Geosci J (2016)

  • W.Y. Ji et al., Efficient model selection for probabilistic K-nearest neighbour classification, Neurocomputing (2015)