

Automated abnormality detection in lower extremity radiographs using deep learning

Abstract

Musculoskeletal disorders are a major healthcare challenge around the world. We investigate the utility of convolutional neural networks (CNNs) in performing generalized abnormality detection on lower extremity radiographs. We also explore the effect of pretraining, dataset size and model architecture on model performance to provide recommendations for future deep learning analyses on extremity radiographs, especially when access to large datasets is challenging. We collected a large dataset of 93,455 lower extremity radiographs of multiple body parts, with each exam labelled as normal or abnormal. A 161-layer densely connected, pretrained CNN achieved an AUC-ROC of 0.880 (sensitivity = 0.714, specificity = 0.961) on this abnormality classification task. Our findings show that a single CNN model can be effectively utilized for the identification of diverse abnormalities in highly variable radiographs of multiple body parts, a result that holds potential for improving patient triage and assisting with diagnostics in resource-limited settings.


Fig. 1: Categorization of patients in training, validation and test sets.
Fig. 2: Model architecture.
Fig. 3: Grad-CAM visualizations for abnormal lower extremities.


Data availability

We are releasing our de-identified test set as part of this manuscript. This dataset includes radiographs from 182 patients and is balanced across normal and abnormal labels as well as the four lower extremity body parts (foot, hip, knee and ankle). In addition, two board-certified radiologists manually refined all labels, ensuring a high level of label accuracy. The dataset is available at https://aimi.stanford.edu/lera-lower-extremity-radiographs-2.

Code availability

Our deep learning training framework is available at: https://github.com/maya124/MSK-LE.


Acknowledgements

This study was supported by the Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI). The research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under award no. R01LM012966 and the Stanford Child Health Research Institute (Stanford NIH-NCATS-CTSA grant #UL1 TR001085). This research used data or services provided by STARR (STAnford medicine Research data Repository), a clinical data warehouse made possible by the Stanford School of Medicine Research Office.

Author information


Contributions

All authors contributed extensively to this work. M.V., M.L. and R.G. designed the methodology and algorithms, implemented models, analysed results and wrote the manuscript. B.N.P. and M.P.L. oversaw the entire project and helped with study design, methodology development and manuscript writing. N.K. and P.R. provided technical advice and manuscript feedback. J.D. and J.L. contributed to statistical analyses and writing the manuscript. C.B. and K.S. assisted with data collection and labelling. L.F.-F. provided resources and advice.

Corresponding author

Correspondence to Bhavik N. Patel.

Ethics declarations

Competing interests

There was no industry support or other funding for this work. There are no conflicts of interest that pertain specifically to this work. However, some of the authors are consultants for the medical industry. M.P.L. is supported by the National Library of Medicine of the NIH (R01LM012966). B.N.P. has grant support from GE. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or GE. M.P.L.’s activities not related to this Article include positions as shareholder and advisory board member for Segmed Inc., Nines.ai and Bunker Hill. M.V., R.G., M.L., N.K., P.R., J.L. and K.S. are not employees or consultants for industry and had control of the data and the analysis.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Cite this article

Varma, M., Lu, M., Gardner, R. et al. Automated abnormality detection in lower extremity radiographs using deep learning. Nat Mach Intell 1, 578–583 (2019). https://doi.org/10.1038/s42256-019-0126-0

