Abstract:
Annotating the right set of data amongst all available data points is a key challenge in many machine learning applications. Batch active learning is a popular approach to address this, in which batches of unlabeled data points are selected for annotation while an underlying learning model is subsequently updated. In this work, we introduce Active Data Shapley (ADS), a filtering layer for batch active learning that significantly increases the efficiency of existing active learning algorithms by pre-selecting, via a linear-time computation, the highest-value points from an unlabeled dataset. Using the notion of the Shapley value of data, our method estimates the value of unlabeled data points with respect to the prediction task at hand. We show that ADS is particularly effective when the pool of unlabeled data exhibits real-world caveats: noise, heterogeneity, and domain shift. We run experiments demonstrating that when ADS is used to pre-select the highest-ranking portion of an unlabeled dataset, the efficiency of state-of-the-art batch active learning methods increases by an average factor of 6x, while preserving effectiveness.
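A minimal sketch of the pre-filtering pattern the abstract describes is given below, assuming a scikit-learn-style classifier with predict_proba and an unlabeled pool stored as a NumPy array. The valuation here (a predictive-margin proxy) and the downstream acquisition (entropy sampling) are hypothetical stand-ins, not the Shapley-based ADS estimator or the batch active learning methods evaluated in the paper; the names value_scores, ads_style_prefilter, and select_batch are illustrative only.

import numpy as np

def value_scores(model, X_pool):
    # Placeholder valuation: a cheap, linear-time predictive-margin proxy.
    # ADS would instead use a Shapley-value estimate of each point's worth.
    proba = model.predict_proba(X_pool)
    top2 = np.sort(proba, axis=1)[:, -2:]          # two largest class probabilities
    margin = top2[:, 1] - top2[:, 0]
    return 1.0 - margin                            # higher score = more valuable

def ads_style_prefilter(model, X_pool, keep_fraction=0.2):
    # Keep only the highest-value fraction of the unlabeled pool.
    scores = value_scores(model, X_pool)
    k = max(1, int(keep_fraction * len(X_pool)))
    return np.argsort(scores)[-k:]                 # indices of the top-k points

def select_batch(model, X_pool, batch_size=32, keep_fraction=0.2):
    # 1) Pre-filter the pool, 2) run the (now cheaper) acquisition on the subset.
    keep_idx = ads_style_prefilter(model, X_pool, keep_fraction)
    proba = model.predict_proba(X_pool[keep_idx])
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    best_local = np.argsort(entropy)[-batch_size:]
    return keep_idx[best_local]                    # indices into the original pool

In this pattern the acquisition function only scores the retained fraction of the pool, which is where the efficiency gain reported in the abstract would come from; the choice of valuation function is the part the paper replaces with its Shapley-based estimate.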
Notes: This article was mistakenly omitted from the original submission to IEEE Xplore. It is now included as part of the conference record.
Date of Conference: 31 October 2022 - 02 November 2022
Date Added to IEEE Xplore: 10 March 2023
ISBN Information: