Abstract:
Dataset drift is a common challenge in machine learning, especially for models trained on unstructured data, such as images. In this article, we propose a new approach fo...Show MoreMetadata
Abstract:
Dataset drift is a common challenge in machine learning, especially for models trained on unstructured data, such as images. In this article, we propose a new approach for the detection of data drift in black-box models, which is based on Hellinger distance and feature extraction methods. The proposed approach is aimed at detecting data drift without knowing the architecture of the model to monitor, the dataset on which it was trained, or both. The article analyzes three different use cases to evaluate the effectiveness of the proposed approach, encompassing a variety of tasks including document segmentation, classification, and handwriting recognition. The use cases considered for the drift are adversarial assaults, domain shifts, and dataset biases. The experimental results show the efficacy of our drift detection approach in identifying changes in distribution under various training settings.
Published in: IT Professional ( Volume: 26, Issue: 2, March-April 2024)