Abstract:
Quantification research has sought to accurately estimate class distributions under dataset shift. While existing methods perform well under assumed conditions of shift, it is not always clear whether such assumptions will hold in a given application. This work extends the analysis and experimental evaluation of our Gain-Some-Lose-Some (GSLS) model for quantification under general dataset shift and incorporates it into a method for dynamically selecting the most appropriate quantification method. Selection by a Kolmogorov-Smirnov test for any shift followed by a newly proposed “Adjusted Kolmogorov-Smirnov” test for non-prior shift is found to best balance quantification and runtime performance. We also present a framework for constraining quantification prediction intervals to user-specified limits by requesting a smaller set of instance class labels from the user than required with confidence-based rejection.
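The selection step described above can be illustrated with a minimal sketch (not the authors' implementation): a two-sample Kolmogorov-Smirnov test on classifier scores decides whether any shift is present before a more specific quantifier is chosen. The function name, score inputs, and the 0.05 threshold below are illustrative assumptions; the paper's "Adjusted Kolmogorov-Smirnov" test for non-prior shift is not reproduced here.

```python
# Minimal sketch of quantifier selection driven by a two-sample KS test.
# detect-shift logic and threshold are illustrative assumptions, not the paper's code.
import numpy as np
from scipy.stats import ks_2samp

def select_quantifier(train_scores, test_scores, alpha=0.05):
    """Pick a quantification strategy based on a KS test for dataset shift.

    train_scores / test_scores: 1-D arrays of classifier confidence scores
    on labelled training data and unlabelled deployment data.
    """
    stat, p_value = ks_2samp(train_scores, test_scores)
    if p_value >= alpha:
        # No detectable shift: a simple Classify-and-Count style estimate suffices.
        return "classify_and_count"
    # Shift detected: the paper follows this with its "Adjusted KS" test to
    # separate prior shift from more general (GSLS-style) shift; omitted here.
    return "shift_aware_quantifier"

# Example usage with synthetic score distributions:
rng = np.random.default_rng(0)
print(select_quantifier(rng.beta(2, 5, 1000), rng.beta(5, 2, 1000)))
```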
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 36, Issue: 7, July 2024)