Drift detection in data stream classification without fully labelled instances | IEEE Conference Publication | IEEE Xplore

Abstract:

Drift detection is an important task in classification-based stream mining because it allows operators to be informed of unintended changes in the system. Current detection approaches usually assume that fully labelled streams are available, which is often unrealistic in on-line real-world applications. We propose two ways to improve the economy and applicability of drift detection: (1) a semi-supervised approach that employs single-pass active learning filters to select the most interesting samples for supervising classifier performance, and (2) a fully unsupervised approach based on the degree of overlap between the classifier's output certainty distributions. Both variants rely on a modified version of the Page-Hinkley test in which a fading factor down-weights older samples, making the test more flexible in detecting successive drift occurrences in a stream. The approaches are compared with the fully supervised state-of-the-art (SoA) variant on two real-world on-line applications. The semi-supervised approach detects three actually occurring drifts in these streams with a delay that is either lower than that of the supervised variant (about 200 versus 300 samples) or the same (about 70 samples), while requiring only 20% labelled samples.
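
To make the role of the fading factor concrete, the following is a minimal sketch of a one-sided Page-Hinkley test with fading. The abstract does not state the exact update rule, so this assumes a common formulation in which the cumulative deviation is multiplied by a factor in (0, 1] before each new term is added. The class name `FadingPageHinkley`, the parameter values, and the synthetic stream are all illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FadingPageHinkley:
    """One-sided Page-Hinkley test (detects increases) with a fading factor.

    Assumed formulation, not necessarily the one used in the paper.
    """
    delta: float = 0.005     # tolerated magnitude of change (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    fading: float = 0.999    # fading factor in (0, 1]; 1.0 gives the classic test
    n: int = 0               # number of samples seen since the last reset
    mean: float = 0.0        # running mean of the monitored signal
    cum: float = 0.0         # faded cumulative deviation from the mean
    cum_min: float = 0.0     # minimum of the cumulative deviation so far

    def update(self, x: float) -> bool:
        """Feed one sample (e.g. an error indicator); return True on an alarm."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Older contributions are down-weighted before the new term is added.
        self.cum = self.fading * self.cum + (x - self.mean - self.delta)
        self.cum_min = min(self.cum_min, self.cum)
        if self.cum - self.cum_min > self.threshold:
            self.reset()  # restart so that later drifts can also be detected
            return True
        return False

    def reset(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0


# Synthetic signal: the level jumps from 0.1 to 0.9 at index 300.
stream = [0.1] * 300 + [0.9] * 300
detector = FadingPageHinkley()
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
print(alarms)
```

In the semi-supervised setting described above, the monitored signal would be computed only from the samples selected for labelling, and in the unsupervised setting from a statistic of the classifier's output certainty; how those signals are built is not specified in the abstract and is left out of this sketch.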
Date of Conference: 01-03 December 2015
Date Added to IEEE Xplore: 04 January 2016
Conference Location: Douai, France
