Applying active learning strategy to classify large scale data with imbalanced classes | IEEE Conference Publication | IEEE Xplore

Abstract:

Classification tasks are now often challenging because the data are large and imbalanced, which can lead to low prediction accuracy and high computation costs. Active Learning is a technique that uses only a small set of data to construct an initial classification model and then iteratively improves the model by incrementally learning from misclassified examples. In this paper, we aim to improve prediction accuracy by applying Active Learning. To address the imbalance issue, the active model was iteratively updated based on the G-mean, and undersampling was also applied. The proposed algorithm is suitable for large-scale data because it does not need the whole data set to construct a model. The experiment was conducted on two standard corpora, one of which contained more than 100,000 examples. The results showed that the prediction performance of a standard technique (Neural Network) can be improved by 5%-13% by applying the Active Learning strategy. Furthermore, this technique also outperformed other classical classification algorithms, including K-nearest neighbors (kNN), Support Vector Machine (SVM), Decision Tree (DT), Naïve Bayes (NB), and Artificial Neural Network (ANN).
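
The abstract names two ingredients for handling imbalance, the G-mean and undersampling, without giving their details. The following is a minimal illustrative sketch, not the authors' implementation. It assumes the common binary definition of the G-mean (the square root of sensitivity times specificity) and plain random undersampling down to the smallest class; the function names `g_mean` and `undersample` and the `seed` parameter are hypothetical.

```python
import math
import random


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (binary labels).

    Assumption: the standard binary G-mean; the paper may use a variant.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def undersample(examples, labels, seed=0):
    """Randomly keep as many examples per class as the smallest class has.

    Assumption: simple random undersampling; the paper's scheme may differ.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for y, group in by_class.items():
        for x in rng.sample(group, target):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced


# A 95/5 imbalanced label set and a model that always predicts the majority.
y_true = [0] * 95 + [1] * 5
always_majority = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
score = g_mean(y_true, always_majority)
balanced = undersample(list(range(100)), y_true)
```

In this toy case the majority-only predictor reaches high plain accuracy while its G-mean is zero, because it never identifies a minority example. This is one reason a G-mean-driven update is a plausible choice for imbalanced data.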
Date of Conference: 27-29 October 2016
Date Added to IEEE Xplore: 19 January 2017
Conference Location: Ansan, Korea (South)
