HSDD: A hybrid sampling strategy for class imbalance in defect prediction data sets | IEEE Conference Publication | IEEE Xplore

HSDD: A hybrid sampling strategy for class imbalance in defect prediction data sets


Abstract:

Class imbalance is a common problem in defect prediction data sets. Over-sampling and under-sampling methods are typically employed to cope with it. However, these methods alter the data at the instance level and are not specialized for the feature space. Moreover, no approach is tailored specifically to class imbalance in defect prediction data sets. We develop HSDD (hybrid sampling for defect data sets) to solve this problem. HSDD comprises not only the derivation of low-level metrics but also the reduction of repeated data points. The method was evaluated on industrial and open source project data sets using Bayes, naive Bayes, random forest, and J48 classifiers, in terms of g-mean and training time. The results show that HSDD yields promising training performance, especially on large-scale data sets.
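The abstract reports results in terms of g-mean. As a minimal sketch, assuming the standard definition (the geometric mean of sensitivity and specificity) and a binary labeling in which 1 marks a defective module, the measure could be computed as follows. The function name `g_mean` and the label encoding are illustrative assumptions and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumption: `positive` marks the minority (defective) class; every
    other label value is treated as the negative class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical labels: 2 defective modules, 4 non-defective ones.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)  # sensitivity 0.5, specificity 0.75
```

A classifier that always predicts the majority class scores 0 under this measure even though its accuracy may be high, which is why g-mean is a common choice for imbalanced data.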
Date of Conference: 19-21 September 2016
Date Added to IEEE Xplore: 26 January 2017
Conference Location: Porto, Portugal
