
Radiological Reports Improve Pre-training for Localized Imaging Tasks on Chest X-Rays

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13435))

Abstract

Self-supervised pre-training on unlabeled images has shown promising results in the medical domain. Recently, methods using text supervision from companion text such as radiological reports have improved on these results even further. However, most works in the medical domain focus on image classification downstream tasks and do not study more localized tasks like semantic segmentation or object detection. We therefore propose a novel evaluation framework consisting of 18 localized tasks, including semantic segmentation and object detection, on five public chest radiography datasets. Using our proposed evaluation framework, we study the effectiveness of existing text-supervised methods and compare them with image-only self-supervised methods and transfer from classification in more than 1200 evaluation runs. Our experiments show that text-supervised methods outperform all other methods on 13 out of 18 tasks, making them the preferred pre-training method. In conclusion, image-only contrastive methods provide a strong baseline if no reports are available, while transfer from classification, even in-domain, does not perform well as pre-training for localized tasks.
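The text-supervised methods compared here (e.g. ConVIRT [40] and CLIP [26]) share a common core: a bidirectional image-text contrastive objective that pulls the embedding of each radiograph toward the embedding of its paired report while pushing apart mismatched pairs within the batch. The following is a minimal illustrative sketch of that objective, not the authors' code; the function name, NumPy implementation, and temperature value are our own assumptions for illustration.

```python
# Sketch of a bidirectional InfoNCE (image-text contrastive) loss, as used in
# ConVIRT/CLIP-style text-supervised pre-training. Illustrative only.
import numpy as np


def info_nce_loss(img_emb: np.ndarray, txt_emb: np.ndarray,
                  temperature: float = 0.1) -> float:
    """Contrastive loss over a batch of paired (N, D) embeddings.

    Row i of img_emb and row i of txt_emb are a matching image-report pair;
    all other rows in the batch serve as negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    logits = img @ txt.T / temperature      # (N, N); matching pairs on the diagonal
    labels = np.arange(len(img))

    def cross_entropy(l: np.ndarray) -> float:
        l = l - l.max(axis=1, keepdims=True)                   # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()               # diagonal = positives

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this loss during pre-training yields image encoders whose features are then transferred to the localized downstream tasks (segmentation, detection) evaluated in the paper; aligned embeddings produce a strictly lower loss than mismatched ones.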


Notes

  1. Note that we only contribute the selection of these datasets and the definition of tasks on them; we do not contribute any new datasets or ground-truth labels.

  2. Note that NIH CXR is a small subset of the ChestX-ray8 [36] dataset that contains detection targets.

References

  1. Bardes, A., Ponce, J., LeCun, Y.: VICReg: variance-invariance-covariance regularization for self-supervised learning. arXiv:2105.04906 (2021)

  2. Caron, M., Touvron, H., Misra, I., et al.: Emerging properties in self-supervised vision transformers. In: ICCV, pp. 9630–9640 (2021). https://doi.org/10.1109/ICCV48922.2021.00951

  3. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: ICML, pp. 1597–1607 (2020)


  4. Chen, X., He, K.: Exploring simple Siamese representation learning. arXiv:2011.10566 (2020)

  5. Desai, K., Johnson, J.: VirTex: learning visual representations from textual annotations. arXiv:2006.06666 (2020)

  6. Desai, S., Baghal, A., Wongsurawat, T., et al.: Data from chest imaging with clinical and genomic correlates representing a rural COVID-19 positive population. Cancer Imaging Arch. (2020). https://doi.org/10.7937/tcia.2020.py71-5978


  7. Ermolov, A., Siarohin, A., Sangineto, E., Sebe, N.: Whitening for self-supervised representation learning. arXiv:2007.06346 (2020)

  8. Gazda, M., Gazda, J., Plavka, J., Drotar, P.: Self-supervised deep convolutional neural network for chest X-ray classification. arXiv:2103.03055 (2021)

  9. Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23), 215–220 (2000)


  10. Grill, J.B., Strub, F., Altché, F., et al.: Bootstrap your own latent - a new approach to self-supervised learning. In: NeurIPS, pp. 21271–21284 (2020)


  11. He, K., Fan, H., Wu, Y., et al.: Momentum contrast for unsupervised visual representation learning. In: CVPR, pp. 9726–9735 (2020). https://doi.org/10.1109/CVPR42600.2020.00975

  12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90

  13. Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., et al.: Learning deep representations by mutual information estimation and maximization. arXiv:1808.06670 (2019)

  14. Hénaff, O.J., Srinivas, A., et al.: Data-efficient image recognition with contrastive predictive coding. In: ICML, pp. 4182–4192 (2020)


  15. Irvin, J., Rajpurkar, P., Ko, M., et al.: CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: AAAI, pp. 590–597 (2019)


  16. JF-Healthcare: object-CXR - automatic detection of foreign objects on chest X-rays. MIDL (2020). https://jfhealthcare.github.io/object-CXR/

  17. Jia, C., Yang, Y., Xia, Y., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: ICML, pp. 4904–4916 (2021)


  18. Johnson, A., Lungren, M., Peng, Y., et al.: MIMIC-CXR-JPG - chest radiographs with structured labels (version 2.0.0). PhysioNet (2019). https://doi.org/10.13026/8360-t248

  19. Johnson, A., Pollard, T., Berkowitz, S., et al.: MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6(317), 1–8 (2019). https://doi.org/10.1038/s41597-019-0322-0


  20. Johnson, A., Pollard, T., Mark, R., Berkowitz, S., Horng, S.: MIMIC-CXR database (version 2.0.0). PhysioNet (2019). https://doi.org/10.13026/C2JT1Q

  21. Li, J., Zhou, P., Xiong, C., Hoi, S.C.H.: Prototypical contrastive learning of unsupervised representations. arXiv:2005.04966 (2021)

  22. Liao, R., et al.: Multimodal representation learning via maximization of local mutual information. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12902, pp. 273–283. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87196-3_26


  23. Liu, Z., Stent, S., Li, J., et al.: LocTex: learning data-efficient visual representations from localized textual supervision. arXiv:2108.11950 (2021)

  24. Misra, I., van der Maaten, L.: Self-supervised learning of pretext-invariant representations. arXiv:1912.01991 (2019)

  25. van den Oord, A., Li, Y., Vinyals, O.: Representation learning with contrastive predictive coding. arXiv:1807.03748 (2019)

  26. Radford, A., Kim, J.W., Hallacy, C., et al.: Learning transferable visual models from natural language supervision. arXiv:2103.00020 (2021)

  27. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767 (2018)

  28. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28


  29. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y


  30. Sariyildiz, M.B., Perez, J., Larlus, D.: Learning visual representations with caption annotations. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 153–170. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3_10


  31. Shih, G., Wu, C.C., Halabi, S.S., et al.: Augmenting the national institutes of health chest radiograph dataset with expert annotations of possible pneumonia. Radiol. Artif. Intell. 1 (2019). https://doi.org/10.1148/ryai.2019180041

  32. Society for Imaging Informatics in Medicine: SIIM-ACR pneumothorax segmentation (2019). https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation

  33. Sowrirajan, H., Yang, J., Ng, A.Y., Rajpurkar, P.: MoCo-CXR: MoCo pretraining improves representation and transferability of chest X-ray models. arXiv:2010.05352 (2021)

  34. Sriram, A., Muckley, M., Sinha, K., et al.: COVID-19 prognosis via self-supervised representation learning and multi-image prediction. arXiv:2101.04909 (2021)

  35. Tang, H., Sun, N., Li, Y.: Segmentation model of the opacity regions in the chest X-rays of the COVID-19 patients in the US rural areas and the application to the disease severity. medRxiv (2020). https://doi.org/10.1101/2020.10.19.20215483

  36. Wang, X., Peng, Y., Lu, L., et al.: ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: CVPR, pp. 3462–3471 (2017). https://doi.org/10.1109/CVPR.2017.369

  37. Wu, Z., Xiong, Y., Yu, S., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: CVPR, pp. 3733–3742 (2018). https://doi.org/10.1109/CVPR.2018.00393

  38. Xie, Z., Lin, Y., Zhang, Z., et al.: Propagate yourself: exploring pixel-level consistency for unsupervised visual representation learning. arXiv:2011.10043 (2020)

  39. Zbontar, J., Jing, L., Misra, I., et al.: Barlow twins: self-supervised learning via redundancy reduction. arXiv:2103.03230 (2021)

  40. Zhang, Y., Jiang, H., Miura, Y., et al.: Contrastive learning of medical visual representations from paired images and text. arXiv:2010.00747 (2020)


Author information

Corresponding author: Philip Müller.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 170 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Müller, P., Kaissis, G., Zou, C., Rueckert, D. (2022). Radiological Reports Improve Pre-training for Localized Imaging Tasks on Chest X-Rays. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham. https://doi.org/10.1007/978-3-031-16443-9_62


  • DOI: https://doi.org/10.1007/978-3-031-16443-9_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16442-2

  • Online ISBN: 978-3-031-16443-9

  • eBook Packages: Computer Science (R0)
