Abstract
In this paper, we introduce IIMDTS, an incremental algorithm for inducing multivariate decision trees that can choose a different subset of splitting attributes at each internal node and can process large datasets. IIMDTS uses every instance of the training set to build the decision tree without storing the whole training set in memory. Experimental results show that our algorithm is faster than three of the most recent algorithms for building decision trees from large datasets.
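The abstract does not give the details of IIMDTS, so the following is only a minimal sketch of the general idea it describes: a node that tests several attributes at once and is updated one instance at a time, keeping per-class summaries instead of the instances themselves. The class name `StreamingNode`, its methods, and the nearest-class-mean routing rule are all hypothetical choices made for illustration. They are not taken from the paper and should not be read as the IIMDTS procedure.

```python
from typing import Dict, List, Sequence


class StreamingNode:
    """Hypothetical tree node (an assumption, not the paper's design).

    It looks only at the attributes listed in `attribute_subset` and stores
    per-class counts and sums, so memory does not grow with the number of
    training instances seen.
    """

    def __init__(self, attribute_subset: Sequence[int]) -> None:
        self.attribute_subset: List[int] = list(attribute_subset)
        self.counts: Dict[str, int] = {}
        self.sums: Dict[str, List[float]] = {}

    def _project(self, features: Sequence[float]) -> List[float]:
        # Keep only the attributes this node is allowed to split on.
        return [features[i] for i in self.attribute_subset]

    def update(self, features: Sequence[float], label: str) -> None:
        # Fold one instance into the summaries; the instance is not stored.
        projected = self._project(features)
        if label not in self.sums:
            self.sums[label] = [0.0] * len(projected)
            self.counts[label] = 0
        self.counts[label] += 1
        self.sums[label] = [s + v for s, v in zip(self.sums[label], projected)]

    def class_means(self) -> Dict[str, List[float]]:
        # Mean of each selected attribute, per class.
        return {
            label: [s / self.counts[label] for s in sums]
            for label, sums in self.sums.items()
        }

    def route(self, features: Sequence[float]) -> str:
        # Illustrative multivariate test: pick the class whose mean is
        # closest (squared Euclidean distance) on the selected attributes.
        projected = self._project(features)
        means = self.class_means()
        return min(
            means,
            key=lambda label: sum(
                (p - m) ** 2 for p, m in zip(projected, means[label])
            ),
        )


# Usage: this node uses attributes 0 and 2 only; attribute 1 is ignored.
node = StreamingNode(attribute_subset=[0, 2])
training_stream = [
    ((1.0, 50.0, 1.0), "a"),
    ((3.0, -50.0, 3.0), "a"),
    ((10.0, 0.0, 10.0), "b"),
    ((12.0, 7.0, 14.0), "b"),
]
for features, label in training_stream:
    node.update(features, label)

print(node.class_means())
print(node.route((2.5, 100.0, 1.5)))
```

In this sketch a different node could be constructed with a different `attribute_subset`, which mirrors the abstract's point that each internal node may use its own subset of attributes. How IIMDTS actually selects those subsets and decides when to expand a node is not stated in the abstract and is not modeled here.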
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Franco-Arcega, A., Carrasco-Ochoa, J.A., Sánchez-Díaz, G., Martínez-Trinidad, J.F. (2010). Multivariate Decision Trees Using Different Splitting Attribute Subsets for Large Datasets. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science(), vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_49
DOI: https://doi.org/10.1007/978-3-642-13059-5_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13058-8
Online ISBN: 978-3-642-13059-5
eBook Packages: Computer Science, Computer Science (R0)