Abstract
Self-supervised pre-training on unlabeled images has shown promising results in the medical domain. Recently, methods using text supervision from companion text such as radiological reports have improved upon these results even further. However, most works in the medical domain focus on image classification downstream tasks and do not study more localized tasks such as semantic segmentation or object detection. We therefore propose a novel evaluation framework consisting of 18 localized tasks, including semantic segmentation and object detection, on five public chest radiography datasets. Using this framework, we study the effectiveness of existing text-supervised methods and compare them with image-only self-supervised methods and with transfer from classification, in more than 1200 evaluation runs. Our experiments show that text-supervised methods outperform all other methods on 13 out of 18 tasks, making them the preferred choice. In conclusion, image-only contrastive methods provide a strong baseline when no reports are available, while transfer from classification, even in-domain, does not perform well as pre-training for localized tasks.
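The text-supervised pre-training compared here follows the CLIP/ConVIRT family [37, 51], which aligns image and report embeddings with a symmetric contrastive objective. As a rough illustration only (not the paper's implementation; the function name, embedding dimensions, and temperature are arbitrary choices for this sketch), the loss can be written as:

```python
import numpy as np

def symmetric_info_nce(img_emb, txt_emb, temperature=0.1):
    """Symmetric InfoNCE loss over paired image/report embeddings.

    Row i of img_emb and row i of txt_emb are a matched pair; all other
    rows in the batch act as negatives, as in CLIP/ConVIRT-style
    pre-training. Returns the mean of the image-to-text and
    text-to-image cross-entropy terms.
    """
    # L2-normalize so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (N, N) similarity matrix

    def cross_entropy(l):
        # Row-wise log-softmax; the correct "class" is the diagonal entry
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the two retrieval directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

# Toy check: perfectly matched pairs should score a lower loss than
# deliberately mismatched (reversed) pairs
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_matched = symmetric_info_nce(emb, emb)
loss_shuffled = symmetric_info_nce(emb, emb[::-1])
```

After pre-training with such an objective, the text encoder is discarded and the image encoder is fine-tuned or probed on the localized downstream tasks.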
Notes
1. Note that we only contribute the selection of these datasets and the definition of tasks on them; we do not contribute any new datasets or ground-truth labels.
2. Note that NIH CXR is a small subset of the ChestX-ray8 [36] dataset that contains detection targets.
References
Bardes, A., Ponce, J., LeCun, Y.: VICReg: variance-invariance-covariance regularization for self-supervised learning. arXiv:2105.04906 (2021)
Caron, M., Touvron, H., Misra, I., et al.: Emerging properties in self-supervised vision transformers. In: ICCV, pp. 9630–9640 (2021). https://doi.org/10.1109/ICCV48922.2021.00951
Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: ICML, pp. 1597–1607 (2020)
Chen, X., He, K.: Exploring simple Siamese representation learning. arXiv:2011.10566 (2020)
Desai, K., Johnson, J.: VirTex: learning visual representations from textual annotations. arXiv:2006.06666 (2020)
Desai, S., Baghal, A., Wongsurawat, T., et al.: Data from chest imaging with clinical and genomic correlates representing a rural COVID-19 positive population. Cancer Imaging Arch. (2020). https://doi.org/10.7937/tcia.2020.py71-5978
Ermolov, A., Siarohin, A., Sangineto, E., Sebe, N.: Whitening for self-supervised representation learning. arXiv:2007.06346 (2020)
Gazda, M., Gazda, J., Plavka, J., Drotar, P.: Self-supervised deep convolutional neural network for chest X-ray classification. arXiv:2103.03055 (2021)
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23), 215–220 (2000)
Grill, J.B., Strub, F., Altché, F., et al.: Bootstrap your own latent - a new approach to self-supervised learning. In: NeurIPS, pp. 21271–21284 (2020)
He, K., Fan, H., Wu, Y., et al.: Momentum contrast for unsupervised visual representation learning. In: CVPR, pp. 9726–9735 (2020). https://doi.org/10.1109/CVPR42600.2020.00975
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., et al.: Learning deep representations by mutual information estimation and maximization. arXiv:1808.06670 (2019)
Hénaff, O.J., Srinivas, A., et al.: Data-efficient image recognition with contrastive predictive coding. In: ICML, pp. 4182–4192 (2020)
Irvin, J., Rajpurkar, P., Ko, M., et al.: CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: AAAI, pp. 590–597 (2019)
JF-Healthcare: object-CXR - automatic detection of foreign objects on chest X-rays. MIDL (2020). https://jfhealthcare.github.io/object-CXR/
Jia, C., Yang, Y., Xia, Y., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: ICML, pp. 4904–4916 (2021)
Johnson, A., Lungren, M., Peng, Y., et al.: MIMIC-CXR-JPG - chest radiographs with structured labels (version 2.0.0). PhysioNet (2019). https://doi.org/10.13026/8360-t248
Johnson, A., Pollard, T., Berkowitz, S., et al.: MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6(317), 1–8 (2019). https://doi.org/10.1038/s41597-019-0322-0
Johnson, A., Pollard, T., Mark, R., Berkowitz, S., Horng, S.: MIMIC-CXR database (version 2.0.0). PhysioNet (2019). https://doi.org/10.13026/C2JT1Q
Li, J., Zhou, P., Xiong, C., Hoi, S.C.H.: Prototypical contrastive learning of unsupervised representations. arXiv:2005.04966 (2021)
Liao, R., et al.: Multimodal representation learning via maximization of local mutual information. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12902, pp. 273–283. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87196-3_26
Liu, Z., Stent, S., Li, J., et al.: LocTex: learning data-efficient visual representations from localized textual supervision. arXiv:2108.11950 (2021)
Misra, I., van der Maaten, L.: Self-supervised learning of pretext-invariant representations. arXiv:1912.01991 (2019)
van den Oord, A., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv:1807.03748 (2019)
Radford, A., Kim, J.W., Hallacy, C., et al.: Learning transferable visual models from natural language supervision. arXiv:2103.00020 (2021)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767 (2018)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
Sariyildiz, M.B., Perez, J., Larlus, D.: Learning visual representations with caption annotations. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 153–170. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3_10
Shih, G., Wu, C.C., Halabi, S.S., et al.: Augmenting the National Institutes of Health chest radiograph dataset with expert annotations of possible pneumonia. Radiol. Artif. Intell. 1 (2019). https://doi.org/10.1148/ryai.2019180041
Society for Imaging Informatics in Medicine: SIIM-ACR pneumothorax segmentation (2019). https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation
Sowrirajan, H., Yang, J., Ng, A.Y., Rajpurkar, P.: MoCo-CXR: MoCo pretraining improves representation and transferability of chest X-ray models. arXiv:2010.05352 (2021)
Sriram, A., Muckley, M., Sinha, K., et al.: COVID-19 prognosis via self-supervised representation learning and multi-image prediction. arXiv:2101.04909 (2021)
Tang, H., Sun, N., Li, Y.: Segmentation model of the opacity regions in the chest X-rays of the COVID-19 patients in the US rural areas and the application to the disease severity. medRxiv (2020). https://doi.org/10.1101/2020.10.19.20215483
Wang, X., Peng, Y., Lu, L., et al.: ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: CVPR, pp. 3462–3471 (2017). https://doi.org/10.1109/CVPR.2017.369
Wu, Z., Xiong, Y., Yu, S., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: CVPR, pp. 3733–3742 (2018). https://doi.org/10.1109/CVPR.2018.00393
Xie, Z., Lin, Y., Zhang, Z., et al.: Propagate yourself: exploring pixel-level consistency for unsupervised visual representation learning. arXiv:2011.10043 (2020)
Zbontar, J., Jing, L., Misra, I., et al.: Barlow twins: self-supervised learning via redundancy reduction. arXiv:2103.03230 (2021)
Zhang, Y., Jiang, H., Miura, Y., et al.: Contrastive learning of medical visual representations from paired images and text. arXiv:2010.00747 (2020)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Müller, P., Kaissis, G., Zou, C., Rueckert, D. (2022). Radiological Reports Improve Pre-training for Localized Imaging Tasks on Chest X-Rays. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham. https://doi.org/10.1007/978-3-031-16443-9_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16442-2
Online ISBN: 978-3-031-16443-9
eBook Packages: Computer Science (R0)