Evaluation of Active Learning Techniques on Medical Image Classification with Unbalanced Data Distributions

  • Conference paper
Deep Generative Models, and Data Augmentation, Labelling, and Imperfections (DGM4MICCAI 2021, DALI 2021)

Abstract

In supervised image classification, convolutional deep neural networks have become the dominant methodology, showing excellent performance in a number of tasks. These models typically require a very large number of labelled data samples to achieve the required performance and generalisability. While data acquisition is relatively easy, data labelling is expensive, particularly in medical imaging, where expertise is required. This has led to the investigation of active learning methods, which aim to choose more effectively which data should be prioritised for labelling. While new algorithms and methodologies continue to be introduced for active learning, each reporting improved performance, one key aspect that can be overlooked is the underlying data distribution of the dataset. Many active learning papers benchmark on curated datasets with balanced class distributions. This is not representative of many real-world scenarios, where the acquired data can be heavily skewed towards a certain class. In this paper, we evaluate the performance of several established active learning techniques on an unbalanced dataset of 15,153 chest X-ray images, forming a more realistic scenario. This paper shows that the unbalanced dataset has a significant impact on the performance of certain algorithms, and that the class distribution should be considered when choosing which active learning strategy to implement.
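As a rough illustration of what "choosing which data should be prioritised for labelling" can look like in practice, the sketch below implements entropy-based uncertainty sampling, a common active learning baseline. The abstract does not name the techniques that were evaluated, so this is an assumption chosen for illustration and not the authors' method. The function names (`predictive_entropy`, `select_for_labelling`, `class_proportions`) and the toy probabilities are hypothetical.

```python
import numpy as np


def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    clipped = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(clipped * np.log(clipped), axis=1)


def select_for_labelling(probs: np.ndarray, budget: int) -> np.ndarray:
    """Indices of the `budget` most uncertain samples in the unlabelled pool."""
    scores = predictive_entropy(probs)
    return np.argsort(-scores, kind="stable")[:budget]


def class_proportions(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Fraction of samples per class, useful for checking how skewed a set is."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()


# Toy pool: softmax outputs of a hypothetical 3-class classifier on 4 unlabelled images.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # close to uniform, most uncertain
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
])
picked = select_for_labelling(pool_probs, budget=2)

# Toy labelled set that is skewed towards class 0.
proportions = class_proportions(np.array([0, 0, 0, 1, 2, 0]), num_classes=3)
```

Comparing the class proportions of the selected batches against those of the full pool is one simple way to see whether a selection strategy is affected by class skew.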

Author information

Corresponding author

Correspondence to Quok Zong Chong.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Chong, Q.Z., Knottenbelt, W.J., Bhatia, K.K. (2021). Evaluation of Active Learning Techniques on Medical Image Classification with Unbalanced Data Distributions. In: Engelhardt, S., et al. Deep Generative Models, and Data Augmentation, Labelling, and Imperfections. DGM4MICCAI 2021, DALI 2021. Lecture Notes in Computer Science, vol 13003. Springer, Cham. https://doi.org/10.1007/978-3-030-88210-5_23

  • DOI: https://doi.org/10.1007/978-3-030-88210-5_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88209-9

  • Online ISBN: 978-3-030-88210-5

  • eBook Packages: Computer Science, Computer Science (R0)
