Abstract
In this paper, we propose a modified decision tree learning algorithm. Several existing approaches improve conventional decision tree learning by modifying the learning phase, so that the resulting tree contains new attributes and/or class labels produced by the modified process. Such modifications, however, can degrade the comprehensibility of the decision tree. We therefore focus on the prediction phase instead. Our approach builds a binary decision tree with ID3, one of the well-known conventional decision tree learning algorithms, but predicts the class label of a new data item using K-NN rather than the prediction rule used by ID3 and most conventional decision tree learning algorithms. Most conventional algorithms predict a class label from the ratio of class labels in a leaf node, selecting the label with the highest proportion in that leaf. When a dataset is hard to separate by class labels, however, leaf nodes contain many data items with mixed class labels, which lowers the accuracy rate, and it is difficult to prepare a good training dataset. We therefore predict the class label from the k nearest neighbor data items, selected by K-NN within the leaf node. We implemented three programs: the first based on our proposed approach, the second based on a conventional decision tree learning algorithm, and the third based on K-NN alone. To evaluate our approach, we compared these programs on a selection of open datasets from the UCI Machine Learning Repository. Experimental results show that our approach outperforms the others.
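The prediction-phase change described above can be illustrated with a small sketch. This is not the authors' implementation; the leaf representation, function names, and the Euclidean distance are assumptions made only to contrast majority voting over a leaf with a K-NN vote restricted to that leaf's training items.

```python
from collections import Counter
import math

def majority_class(leaf_items):
    """Conventional prediction: most frequent class label among all items in the leaf."""
    labels = [label for _, label in leaf_items]
    return Counter(labels).most_common(1)[0][0]

def knn_in_leaf(leaf_items, query, k=3):
    """Proposed prediction: majority vote over the k items in the leaf nearest to the query."""
    by_distance = sorted(leaf_items, key=lambda item: math.dist(item[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# A toy leaf with mixed labels: the leaf-wide majority is "A", but items near
# the query point are all labeled "B", so the two rules disagree.
leaf = [((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.2, 0.1), "A"),
        ((5.0, 5.0), "B"), ((5.1, 5.0), "B")]
print(majority_class(leaf))            # -> A
print(knn_in_leaf(leaf, (5.05, 5.0)))  # -> B
```

This isolates the motivation stated in the abstract: when a leaf is impure, a local K-NN vote inside the leaf can recover a label that the leaf-wide ratio would miss.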
Acknowledgment
This work was supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (C) 24500121. We would like to thank Ms. Saori Amanuma, who completed a master's course at the Graduate School of Iwate Prefectural University.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Kurematsu, M., Hakura, J., Fujita, H. (2015). A Framework for a Decision Tree Learning Algorithm with K-NN. In: Fujita, H., Selamat, A. (eds) Intelligent Software Methodologies, Tools and Techniques. SoMeT 2014. Communications in Computer and Information Science, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-17530-0_4
Print ISBN: 978-3-319-17529-4
Online ISBN: 978-3-319-17530-0