Active Learning for kNN Using Instance Impact

Qayyumi, Sayed Waleed; Park, Laurence A. F.; Obst, Oliver

doi:10.1007/978-3-031-22695-3_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13728)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

Labelling unlabelled data is a time-consuming and expensive process, so labelling initiatives should select the samples most likely to improve the accuracy of the classifier. Several methods can be employed to accomplish this goal. One is to select the samples whose predicted labels have the highest uncertainty and have experts label them. Another is to choose samples at random. This paper proposes three methods for identifying the unlabelled samples that will improve predictive accuracy once they are labelled. Our study explores how to select samples when very few labelled samples are available from manifold-distributed data sets. To assess performance, we compare our approaches with uncertainty sampling and random sampling. Using public and real-world data sets, we demonstrate that our methods outperform both baselines.
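For orientation, the two baselines named in the abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' proposed methods, which this preview does not describe. The function names, the toy data, the choice of k = 3, and the use of the majority-vote fraction as the confidence score are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def knn_confidence(labelled, point, k=3):
    """Fraction of the k nearest labelled neighbours that share the majority label.

    `labelled` is a list of (coordinates, label) pairs (hypothetical layout).
    """
    nearest = sorted(labelled, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][1] / len(nearest)


def uncertainty_sampling(labelled, pool, k=3):
    """Index of the pool point whose kNN prediction is least confident."""
    return min(range(len(pool)), key=lambda i: knn_confidence(labelled, pool[i], k))


def random_sampling(pool, rng):
    """Index of a pool point chosen uniformly at random."""
    return rng.randrange(len(pool))


# Toy data (assumed for illustration): two well-separated classes.
labelled = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
            ((4.0, 0.0), 1), ((4.0, 1.0), 1), ((3.0, 0.0), 1)]
# Unlabelled pool: one point inside each class and one midway between them.
pool = [(0.2, 0.2), (2.0, 0.0), (3.8, 0.2)]

query_index = uncertainty_sampling(labelled, pool)
random_index = random_sampling(pool, random.Random(0))
print("uncertainty sampling picks pool index", query_index)
print("random sampling picks pool index", random_index)
```

In this toy setting, uncertainty sampling asks for the label of the point lying between the two classes, because its three nearest neighbours disagree, whereas random sampling ignores the current model entirely.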

Author information

Authors and Affiliations

Centre for Research in Mathematics and Data Science, School of Computer, Data and Mathematical Sciences, Western Sydney University, Locked Bag 1797, Penrith, NSW, 2751, Australia
Sayed Waleed Qayyumi, Laurence A. F. Park & Oliver Obst

Corresponding author

Correspondence to Sayed Waleed Qayyumi.

Editor information

Editors and Affiliations

University of New South Wales, Sydney, NSW, Australia
Haris Aziz
University of Western Australia, Perth, WA, Australia
Débora Corrêa
University of Western Australia, Perth, WA, Australia
Tim French

About this paper

Cite this paper

Qayyumi, S.W., Park, L.A.F., Obst, O. (2022). Active Learning for kNN Using Instance Impact. In: Aziz, H., Corrêa, D., French, T. (eds) AI 2022: Advances in Artificial Intelligence. AI 2022. Lecture Notes in Computer Science, vol 13728. Springer, Cham. https://doi.org/10.1007/978-3-031-22695-3_29

DOI: https://doi.org/10.1007/978-3-031-22695-3_29
Published: 03 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22694-6
Online ISBN: 978-3-031-22695-3
eBook Packages: Computer Science, Computer Science (R0)
