Abstract:
Nonstationary environments, where underlying distributions change over time, are becoming increasingly common in real-world applications. A specific example of such an environment is concept drift, where the joint probability distributions of observed data drift over time. Such environments call for a model that can update its parameters to adapt to the changing environment. An extreme case of this scenario, referred to as extreme verification latency, arises when labeled data are available only at initialization, with unlabeled data becoming available in a streaming fashion thereafter. In such a scenario, the classifier must update its hypothesis based only on unlabeled data drawn from the drifting distributions. In our prior work, we described a framework, called COMPOSE, that works well in this type of environment, provided that the data distributions experience limited (or gradual) drift. The limited drift assumption is common in many concept drift algorithms, yet, surprisingly, there is little or no formal definition of this assumption. In this contribution, we describe a mechanism to formally quantify limited drift. We define two metrics: one that represents the normalized class separation drift, and another that uses the ratio of between-class separation to within-class drift through time. We test these metrics on both synthetic and real-world problems, and argue that the latter metric is the more suitable of the two.
Date of Conference: 12-17 July 2015
Date Added to IEEE Xplore: 01 October 2015