Abstract
Learning representative computational models from medical imaging data requires large training data sets. Often, voxel-level annotation is infeasible for sufficient amounts of data. An alternative to manual annotation is to use the enormous amount of knowledge encoded in imaging data and corresponding reports generated during clinical routine. Weakly supervised learning approaches can link volume-level labels to image content but suffer from the typical label distributions in medical imaging data, where only a small part consists of clinically relevant abnormal structures. In this paper, we propose to use a semantic representation of clinical reports as a learning target that is predicted from imaging data by a convolutional neural network. We demonstrate how we can learn accurate voxel-level classifiers based on weak volume-level semantic descriptions on a set of 157 optical coherence tomography (OCT) volumes. We specifically show how semantic information increases classification accuracy for intraretinal cystoid fluid (IRC), subretinal fluid (SRF), and normal retinal tissue, and how the learning algorithm links semantic concepts to image content and geometry.
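To make the idea of a semantic learning target concrete, the following minimal sketch shows one way a volume-level description could be turned into a multi-hot vector for a network to predict. It is an illustration under stated assumptions, not the encoding used in the paper: the three classes are taken from the abstract, while the location terms, the `encode_description` helper, and the vector layout are hypothetical.

```python
# Classes named in the abstract.
CLASSES = ("IRC", "SRF", "normal")
# Location vocabulary: an illustrative assumption, not the paper's ontology.
LOCATIONS = ("central", "peripheral")


def encode_description(findings):
    """Map a volume-level description to a multi-hot target vector.

    `findings` is an iterable of (class_name, location) pairs, e.g.
    [("IRC", "central")]. `location` may be None when the report gives
    no spatial information for that finding.

    Layout (hypothetical): the first len(CLASSES) entries flag which
    classes are mentioned; the remaining entries flag each
    (class, location) combination.
    """
    n_cls, n_loc = len(CLASSES), len(LOCATIONS)
    target = [0.0] * (n_cls + n_cls * n_loc)
    for cls, loc in findings:
        c = CLASSES.index(cls)  # raises ValueError for unknown classes
        target[c] = 1.0
        if loc is not None:
            l = LOCATIONS.index(loc)
            target[n_cls + c * n_loc + l] = 1.0
    return target


if __name__ == "__main__":
    # One weak label per volume: no voxel-level annotation is involved.
    print(encode_description([("IRC", "central"), ("SRF", None)]))
```

A vector of this kind carries more information than a single volume-level class label, because it can state which findings are present together with a coarse indication of where they occur.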
T. Schlegl—This work has received funding from the European Union FP7 (KHRESMOI FP7-257528, VISCERAL FP7-318068) and the Austrian Federal Ministry of Science, Research and Economy.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Schlegl, T., Waldstein, S.M., Vogl, WD., Schmidt-Erfurth, U., Langs, G. (2015). Predicting Semantic Descriptions from Medical Images with Convolutional Neural Networks. In: Ourselin, S., Alexander, D., Westin, CF., Cardoso, M. (eds) Information Processing in Medical Imaging. IPMI 2015. Lecture Notes in Computer Science, vol 9123. Springer, Cham. https://doi.org/10.1007/978-3-319-19992-4_34
DOI: https://doi.org/10.1007/978-3-319-19992-4_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19991-7
Online ISBN: 978-3-319-19992-4
eBook Packages: Computer Science, Computer Science (R0)