Abstract
Pneumonia is a prevalent disease, and some pneumonia patients develop parapneumonic effusion. Among pneumonia patients, those with parapneumonic effusion have a higher mortality rate than those without. Parapneumonic effusion is classified into an uncomplicated stage and a complicated stage: patients with complicated parapneumonic effusion require pleural drainage, whereas patients with uncomplicated parapneumonic effusion need only antibiotic treatment. Consequently, accurate staging of parapneumonic effusion plays a crucial role in reducing mortality among pneumonia patients. A previous method employs convolutional neural networks to extract features from CT slices and graph neural networks to classify the slice-level feature sequence for parapneumonic effusion staging. However, we argue that transformers outperform graph neural networks in feature sequence classification, so we integrate transformers into our method. Distinguishing between uncomplicated and complicated parapneumonic effusions from CT slices alone is challenging because the two stages appear similar. We therefore incorporate additional information in the form of a prompt to aid the staging process. This allows us to propose a promptable model for parapneumonic effusion staging that incorporates multimodal information. Our method surpasses previous approaches, achieving an F1-score of 92.72 and an AUC of 96.67.
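The abstract describes a pipeline in which a CNN encodes each CT slice, a prompt carrying additional multimodal information is injected as an extra token, and a transformer classifies the resulting slice-feature sequence. The following is a minimal PyTorch sketch of that idea, not the authors' implementation; all names (`PromptableEffusionStager`, `prompt_proj`, the tiny CNN backbone, and all dimensions) are illustrative assumptions.

```python
# Hypothetical sketch of a promptable staging model (assumed structure,
# not the paper's code): a 2D CNN encodes each CT slice, auxiliary
# clinical information is projected into a prompt token, and a
# transformer encoder classifies the token sequence via a [CLS] token.
import torch
import torch.nn as nn

class PromptableEffusionStager(nn.Module):
    def __init__(self, feat_dim=128, prompt_dim=8, n_heads=4,
                 n_layers=2, n_classes=2):
        super().__init__()
        # Per-slice feature extractor (stand-in for a deeper CNN backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Project multimodal clinical information into a prompt token.
        self.prompt_proj = nn.Linear(prompt_dim, feat_dim)
        # Learnable classification token pooled for the final decision.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, feat_dim))
        layer = nn.TransformerEncoderLayer(feat_dim, n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, slices, prompt):
        # slices: (B, S, 1, H, W) stack of CT slices; prompt: (B, prompt_dim)
        b, s = slices.shape[:2]
        feats = self.cnn(slices.flatten(0, 1)).view(b, s, -1)  # (B, S, D)
        tokens = torch.cat(
            [self.cls_token.expand(b, -1, -1),
             self.prompt_proj(prompt).unsqueeze(1),
             feats], dim=1)
        # Classify from the [CLS] position (uncomplicated vs. complicated).
        return self.head(self.encoder(tokens)[:, 0])
```

Replacing a graph neural network over slice features with a transformer encoder, as the abstract argues, amounts to swapping the sequence-classification module while keeping the per-slice CNN features unchanged.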
Acknowledgements
This work was supported in part by the Ningbo Clinical Research Center for Medical Imaging (No. 2021L003) and by the Project of Ningbo Leading Medical & Health Discipline (No. 2022-S02).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chen, Y., Liu, Q., Xiang, Y. (2024). A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14437. Springer, Singapore. https://doi.org/10.1007/978-981-99-8558-6_21
DOI: https://doi.org/10.1007/978-981-99-8558-6_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8557-9
Online ISBN: 978-981-99-8558-6