Abstract
This paper analyzes existing decision tree algorithms for handling multi-valued and multi-labeled data. These algorithms have two shortcomings: the choice of splitting attribute is difficult, and the similarity calculation is not precise enough. To address these deficiencies, this paper proposes a new decision tree algorithm for multi-valued and multi-labeled data (AMDT). First, a new formula, sim5, is proposed for calculating the similarity between two label-sets in the child nodes. It considers both the labels that appear in the two label-sets simultaneously and those that are absent from both, and adjusts the proportion between these two parts with a coefficient α, making the similarity calculation for label-sets more comprehensive and accurate. Second, new conditions are proposed for deciding when a node stops splitting. Finally, a prediction method is given. Results of comparison experiments with the existing algorithms MMC, SSC and SCC_SP_1 show that AMDT achieves higher predictive accuracy.
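The abstract describes sim5 only at a high level: it combines agreement on labels present in both label-sets with agreement on labels absent from both, weighted by α. The sketch below is a hypothetical illustration of that idea, not the paper's exact sim5 formula; the function name, the Jaccard-style terms, and the role of the label universe are assumptions for illustration.

```python
# Hypothetical sketch of a label-set similarity in the spirit described in the
# abstract: it rewards labels present in both sets and labels absent from both
# (relative to the full label universe), blended by a coefficient alpha.
# The exact sim5 formula is defined in the paper; this is only an illustration.

def labelset_similarity(a, b, universe, alpha=0.5):
    """Blend of co-presence and co-absence agreement between label-sets a and b.

    a, b      -- sets of labels attached to two records (or child-node label-sets)
    universe  -- set of all labels that can occur in the data
    alpha     -- weight of the co-presence term; (1 - alpha) weights co-absence
    """
    a, b = set(a), set(b)
    union = a | b
    co_present = len(a & b) / len(union) if union else 1.0        # agreement on presence
    absent = universe - union
    co_absent = len(absent) / len(universe) if universe else 0.0  # agreement on absence
    return alpha * co_present + (1.0 - alpha) * co_absent


# Example: with universe {l1, l2, l3, l4}, the sets {l1, l2} and {l1, l3}
# agree on one present label (l1) and one absent label (l4).
print(labelset_similarity({"l1", "l2"}, {"l1", "l3"}, {"l1", "l2", "l3", "l4"}, alpha=0.5))
```

Varying α shifts the emphasis between shared-present and shared-absent labels, which is the role the abstract attributes to the coefficient.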

Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61073133, 60773084 and 60603023, and by the National Research Foundation for the Doctoral Program of Higher Education of China under Grant No. 20070151009.