Statistical Entropy Measures in C4.5 Trees

Aldo Ramirez Arellano, Juan Bory-Reyes, Luis Manuel Hernandez-Simon

Source Title: International Journal of Data Warehousing and Mining (IJDWM) 14(1)

ISSN: 1548-3924 | EISSN: 1548-3932 | EISBN13: 9781522542643 | DOI: 10.4018/IJDWM.2018010101

MLA

Arellano, Aldo Ramirez, et al. "Statistical Entropy Measures in C4.5 Trees." IJDWM vol.14, no.1 2018: pp.1-14. http://doi.org/10.4018/IJDWM.2018010101

APA

Arellano, A. R., Bory-Reyes, J., & Hernandez-Simon, L. M. (2018). Statistical Entropy Measures in C4.5 Trees. International Journal of Data Warehousing and Mining (IJDWM), 14(1), 1-14. http://doi.org/10.4018/IJDWM.2018010101

Chicago

Arellano, Aldo Ramirez, Juan Bory-Reyes, and Luis Manuel Hernandez-Simon. "Statistical Entropy Measures in C4.5 Trees," International Journal of Data Warehousing and Mining (IJDWM) 14, no.1: 1-14. http://doi.org/10.4018/IJDWM.2018010101

Abstract

The main goal of this article is to present a statistical study of decision tree learning algorithms based on different parametric entropy measures. Partial empirical evidence is presented to support the conjecture that adjusting the parameter of an entropy measure might bias the classification. Receiver operating characteristic (ROC) curve analysis, specifically the area under the ROC curve (AURC), gives the best criterion for evaluating decision trees based on parametric entropies. The authors emphasize that the improvement in AURC depends on the type of each dataset. The results support the hypothesis that parametric algorithms are useful for datasets with only numeric or only nominal attributes, but not for datasets with mixed attributes; thus, four hybrid approaches are proposed. The hybrid algorithm based on Renyi entropy is suitable for nominal, numeric, and mixed datasets. Moreover, it requires less time when the number of nodes is reduced while the AURC is maintained or increased, which makes it preferable for large datasets.
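
For readers unfamiliar with parametric entropies, the following is a minimal sketch of the standard Renyi entropy of a class distribution, with Shannon entropy as the limiting case when the parameter approaches 1. The abstract does not state the formulas or the parameter values the authors use, so the function names, the base-2 logarithm, and the example values of alpha are assumptions made here for illustration only; this is not the authors' implementation.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy in bits for alpha > 0 (standard textbook definition).

    Falls back to Shannon entropy when alpha is (numerically) 1,
    which is the limit of the Renyi formula as alpha approaches 1.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if math.isclose(alpha, 1.0):
        return shannon_entropy(probs)
    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log2(power_sum) / (1.0 - alpha)


if __name__ == "__main__":
    # Hypothetical class distribution at a tree node: 90% vs 10%.
    node = [0.9, 0.1]
    for alpha in (0.5, 1.0, 2.0):  # illustrative parameter values
        print(alpha, renyi_entropy(node, alpha))
```

In a C4.5-style learner, a measure like this would take the place of Shannon entropy when scoring candidate splits, and changing alpha changes how strongly skewed class distributions are rewarded, which is the kind of parameter adjustment the abstract refers to.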
