Abstract
Thyroid nodules occur in up to 68% of people, and 95% of these nodules are benign. Many of the 5% that are malignant would never cause symptoms or death, yet 600,000 fine-needle aspirations (FNAs) are still performed annually, with a positive predictive value (PPV) of 5–7% (up to 30%). Artificial intelligence (AI) systems have the capacity to improve diagnostic accuracy and workflow efficiency when integrated into clinical decision pathways. Previous studies have evaluated AI systems against physicians; we instead aim to measure the benefit of incorporating AI into physicians’ final diagnostic decisions. This work analyzed the potential for AI-based decision support systems to improve physician accuracy and efficiency and to reduce inter-reader variability. The decision support system (DSS) assessed was Koios DS, which provides automated sonographic nodule descriptor predictions and a direct cancer risk assessment aligned to ACR TI-RADS. The study was conducted retrospectively between 08/2020 and 10/2020. The case set included 650 patients (21% male, 79% female) aged 53 ± 15 years. Fifteen physicians assessed every case in the set, both unassisted and aided by the DSS. The order of the reading conditions was randomized, and reading blocks were separated by a period of 4 weeks. The system’s impact on reader accuracy was measured by comparing the area under the ROC curve (AUC), sensitivity, and specificity of readers with and without the DSS, with FNA as ground truth. The impact on reader variability was evaluated using Pearson’s correlation coefficient. The impact on efficiency was determined by comparing the average time per read. When aided by Koios DS, there was a statistically significant increase in average AUC of 0.083 [0.066, 0.099], and sensitivity and specificity increased by 8.4% [5.4%, 11.3%] and 14% [12.5%, 15.5%], respectively. The average time per case decreased by 23.6% (p = 0.00017), and the observed Pearson’s correlation coefficient increased from r = 0.622 to r = 0.876 when aided by Koios DS.
These results indicate that providing physicians with automated clinical decision support significantly improved diagnostic accuracy, as measured by AUC, sensitivity, and specificity, and reduced inter-reader variability and interpretation times.
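As an illustration of the reader metrics named above, the following is a minimal sketch of how AUC, sensitivity, specificity, and Pearson's correlation coefficient can be computed for a single reader. It is not the study's analysis code. The toy scores, the function names, and the choice of a TI-RADS level of 4 or higher as a positive call are all assumptions made for the example; the abstract does not state the operating point used.

```python
from math import sqrt


def auc(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one.

    Ties between a malignant and a benign score count as half a win
    (the Mann-Whitney formulation of the area under the ROC curve).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, calls):
    """Sensitivity and specificity of binary calls against ground truth."""
    tp = sum(1 for l, c in zip(labels, calls) if l == 1 and c == 1)
    fn = sum(1 for l, c in zip(labels, calls) if l == 1 and c == 0)
    tn = sum(1 for l, c in zip(labels, calls) if l == 0 and c == 0)
    fp = sum(1 for l, c in zip(labels, calls) if l == 0 and c == 1)
    return tp / (tp + fn), tn / (tn + fp)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: 1 = malignant, 0 = benign (reference standard),
# and two readers' TI-RADS levels (1 to 5) for the same eight nodules.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
reader_a = [5, 4, 3, 3, 2, 4, 1, 2]
reader_b = [5, 5, 4, 2, 2, 3, 1, 1]

# Assumed operating point for the example: level >= 4 is a positive call.
calls_a = [1 if level >= 4 else 0 for level in reader_a]

auc_a = auc(truth, reader_a)
sens_a, spec_a = sensitivity_specificity(truth, calls_a)
agreement_ab = pearson_r(reader_a, reader_b)
```

Running the same functions on each reader's unaided and aided scores, then averaging the differences across readers, gives the kind of paired comparison the abstract describes. Confidence intervals for a multi-reader design need a dedicated method and are outside this sketch.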
Data Availability
The data section describes how to evaluate other models against our data.
Funding
This study was supported by Koios Medical.
Ethics declarations
Ethics Approval
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by Western IRB.
Consent to Participate
Informed consent was obtained from all individual participants included in the study.
Competing Interests
Edward G. Grant was financially compensated for his time as a reader in this study.
Iñaki Arguelles was financially compensated for his time as a reader in this study.
Jordi Reverter was financially compensated for his time as a reader in this study.
Michael D. Beland was financially compensated for his time as a reader in this study.
Ross W. Filice was financially compensated for his time as a reader in this study.
Lev Barinov is a scientific and clinical advisor at Koios Medical.
Ajit Jairaj is an employee of Koios Medical.
No other authors have any disclosures.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Barinov, L., Jairaj, A., Middleton, W.D. et al. Improving the Efficacy of ACR TI-RADS Through Deep Learning-Based Descriptor Augmentation. J Digit Imaging 36, 2392–2401 (2023). https://doi.org/10.1007/s10278-023-00884-z
DOI: https://doi.org/10.1007/s10278-023-00884-z