
A Practical Tutorial for Decision Tree Induction: Evaluation Measures for Candidate Splits and Opportunities

Published: 21 January 2021

Abstract

Experts from different domains have turned to machine learning techniques to produce explainable models that support decision-making. Among these techniques, decision trees have proven useful for classification in many application domains, since they express their decisions in a language close to that of the experts. Many researchers have tried to build better decision tree models by improving the components of the induction algorithm; one of the most studied and improved components is the evaluation measure for candidate splits.
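To make the notion of an evaluation measure for candidate splits concrete, below is a minimal sketch of the gain ratio, the split measure used by the original C4.5 algorithm; the toy labels and the two-branch partition are illustrative assumptions, not data from the article.

    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (base 2) of a collection of class labels."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain_ratio(parent_labels, branches):
        """Gain ratio (C4.5's default split measure) of a candidate split that
        partitions parent_labels into the label lists given in branches."""
        n = len(parent_labels)
        # Information gain: parent entropy minus the weighted entropy of the branches.
        gain = entropy(parent_labels) - sum(len(b) / n * entropy(b) for b in branches if b)
        # Split information penalizes splits that scatter examples over many branches.
        split_info = -sum(len(b) / n * math.log2(len(b) / n) for b in branches if b)
        return gain / split_info if split_info > 0 else 0.0

    # Toy candidate split of 10 labeled examples into two branches (illustrative only).
    parent = ["yes"] * 5 + ["no"] * 5
    left, right = ["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4
    print(round(gain_ratio(parent, [left, right]), 3))  # ~0.278

Replacing gain_ratio with a different function of the same branch partition is precisely what distinguishes the C4.5 variants discussed below.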

In this article, we first present a tutorial that explains decision tree induction. We then describe an experimental framework for assessing the performance of 21 evaluation measures that produce different C4.5 variants, using 110 databases, two performance measures, and 10×10-fold cross-validation. Furthermore, we compare and rank the evaluation measures through a Bayesian statistical analysis. From our experimental results, we derive the first two performance rankings of C4.5 variants reported in the literature, and we organize the evaluation measures into two groups according to their performance. Finally, we introduce meta-models that automatically determine, for a new database, which group of evaluation measures should be used to produce a C4.5 variant, and we outline further opportunities for decision tree models.
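As a rough illustration of the evaluation protocol described above, the following sketch runs 10×10-fold cross-validation and collects per-fold accuracies for two tree learners. The scikit-learn trees with entropy and Gini split criteria are only hypothetical stand-ins for the C4.5 variants studied in the article, and iris stands in for the 110 databases; the paired per-fold scores are the kind of input a Bayesian statistical comparison consumes.

    # Hypothetical stand-ins: scikit-learn decision trees with two different split
    # criteria play the role of two C4.5 variants; iris replaces the 110 databases.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)  # 10x10-fold CV

    for criterion in ("entropy", "gini"):
        tree = DecisionTreeClassifier(criterion=criterion, random_state=0)
        scores = cross_val_score(tree, X, y, cv=cv, scoring="accuracy")  # 100 paired folds
        # The article feeds paired per-fold results like these into a Bayesian
        # statistical comparison; here we only report the mean accuracy.
        print(criterion, round(scores.mean(), 3))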




Published in

ACM Computing Surveys, Volume 54, Issue 1 (January 2022), 844 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3446641

      Copyright © 2021 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 January 2021
      • Accepted: 1 October 2020
      • Revised: 1 May 2020
      • Received: 1 November 2019


