Abstract
Many medical image classification tasks share a common unbalanced data problem. That is, images of the target classes, e.g., certain types of diseases, appear in only a very small portion of the entire dataset. Nowadays, large collections of medical images are readily available. However, it is costly and may not even be feasible for medical experts to manually comb through a huge unlabeled dataset to obtain enough representative examples of the rare classes. In this paper, we propose a new method called Unified LF&SM that recommends the most similar images for each class from a large unlabeled dataset for verification by medical experts and inclusion in the seed labeled dataset. Our real data augmentation significantly reduces expensive manual labeling time. In our experiments, Unified LF&SM performed best, selecting a high percentage of relevant images in its recommendations and achieving the best classification accuracy. The method is easily extendable to other medical image classification problems.
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhang, C., Tavanapong, W., Wong, J., de Groen, P.C., Oh, J. (2017). Real Data Augmentation for Medical Image Classification. In: Cardoso, M., et al. Intravascular Imaging and Computer Assisted Stenting, and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis. CVII-STENT 2017, LABELS 2017. Lecture Notes in Computer Science, vol 10552. Springer, Cham. https://doi.org/10.1007/978-3-319-67534-3_8
DOI: https://doi.org/10.1007/978-3-319-67534-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67533-6
Online ISBN: 978-3-319-67534-3
eBook Packages: Computer Science, Computer Science (R0)