Abstract:
A decision tree is one of the most popular classifiers and classifies balanced data sets effectively. On an imbalanced data set, however, a standard decision tree tends to misclassify instances of a class that has only a small number of samples. In this paper, we modify the decision tree induction algorithm to perform a ternary split on continuous-valued attributes, focusing on the distribution of minority class instances. The algorithm uses the minority variance to rank the candidates with high gain ratio, then chooses the candidate with the minimum minority entropy. In our experiments with data sets from the UCI and Statlog repositories, this method achieves better performance on imbalanced data sets than C4.5, which uses only the gain ratio.
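
The selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The gain ratio follows the standard C4.5 definition applied to three partitions. The abstract does not define minority entropy, so the version here (entropy of how the minority instances spread across the three partitions) is an assumption, as are all function names, the `top_k` parameter, and the way candidate thresholds are supplied. The minority variance ranking is omitted because the abstract gives no formula for it.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def ternary_partition(values, labels, low, high):
    """Split labels into three groups: v <= low, low < v <= high, v > high."""
    parts = ([], [], [])
    for v, y in zip(values, labels):
        idx = 0 if v <= low else (1 if v <= high else 2)
        parts[idx].append(y)
    return parts


def gain_ratio(labels, parts):
    """Standard C4.5 gain ratio: information gain divided by split information."""
    n = len(labels)
    non_empty = [p for p in parts if p]
    remainder = sum(len(p) / n * entropy(p) for p in non_empty)
    split_info = -sum(len(p) / n * math.log2(len(p) / n) for p in non_empty)
    if split_info == 0:
        return 0.0
    return (entropy(labels) - remainder) / split_info


def minority_entropy(parts, minority):
    """ASSUMED definition: entropy of the minority-class counts across partitions.

    A value of 0 means all minority instances fall into a single partition.
    """
    counts = [sum(1 for y in p if y == minority) for p in parts]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def choose_split(values, labels, candidates, minority, top_k=3):
    """Shortlist the top_k candidates by gain ratio, then pick the one with
    the minimum (assumed) minority entropy. top_k is a hypothetical parameter."""
    scored = []
    for low, high in candidates:
        parts = ternary_partition(values, labels, low, high)
        scored.append((gain_ratio(labels, parts),
                       minority_entropy(parts, minority),
                       (low, high)))
    shortlist = sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]
    return min(shortlist, key=lambda s: s[1])[2]
```

With a toy attribute whose minority instances sit in the middle of the value range, a candidate pair that isolates them in the middle partition has both a high gain ratio and zero minority entropy under this assumed definition, so it is the one selected.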
Date of Conference: 26-28 July 2011
Date Added to IEEE Xplore: 15 September 2011