
A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14437))


Abstract

Pneumonia is a prevalent disease, and some pneumonia patients develop parapneumonic effusion. Among pneumonia patients, the mortality rate of those with parapneumonic effusion is higher than that of those without. Parapneumonic effusion is staged as either uncomplicated or complicated: patients with complicated parapneumonic effusion require pleural drainage, whereas patients with uncomplicated parapneumonic effusion require only antibiotic treatment. Consequently, staging parapneumonic effusion plays a crucial role in reducing mortality among pneumonia patients. The previous method employs convolutional neural networks to extract features from CT slices and graph neural networks to classify the resulting slice-level feature sequence. We argue, however, that transformers outperform graph neural networks at feature-sequence classification, and therefore integrate a transformer into our method. Moreover, distinguishing uncomplicated from complicated parapneumonic effusion from CT slices alone is challenging because the two stages look similar, so we incorporate additional information in the form of a prompt to aid staging. This yields a promptable model for parapneumonic effusion staging that exploits multimodal information. Our method surpasses previous approaches, achieving an F1-score of 92.72 and an AUC of 96.67.
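The pipeline the abstract describes (per-slice features classified as a sequence by a transformer, with a prompt token carrying extra multimodal information) can be sketched numerically. This is a minimal illustration, not the authors' implementation: the dimensions, the single attention head, the random stand-in weights, and the use of the prompt position as the classification token are all assumptions made for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # feature dimension (illustrative)
n_slices = 8    # CT slices in one scan (illustrative)

# Stand-ins for CNN-extracted slice features and an encoded clinical prompt.
slice_feats = rng.standard_normal((n_slices, d))
prompt = rng.standard_normal((1, d))

# Prepend the prompt so it can attend to, and be attended by, every slice.
seq = np.concatenate([prompt, slice_feats], axis=0)  # (n_slices + 1, d)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ v

Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = self_attention(seq, Wq, Wk, Wv)               # (n_slices + 1, d)

# Classify from the prompt position, analogous to a [CLS] token:
# two classes, uncomplicated vs. complicated effusion.
Wc = rng.standard_normal((d, 2)) / np.sqrt(d)
logits = out[0] @ Wc
probs = np.exp(logits) / np.exp(logits).sum()
```

A real model would stack several such attention layers with feed-forward blocks and learn all the weight matrices end to end; the sketch only shows how a prompt token lets auxiliary information interact with the slice-feature sequence before classification.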



Acknowledgements

This work was supported in part by the Ningbo Clinical Research Center for Medical Imaging (No. 2021L003) and by the Project of Ningbo Leading Medical & Health Discipline (No. 2022-S02).

Author information

Correspondence to Yao Xiang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chen, Y., Liu, Q., Xiang, Y. (2024). A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14437. Springer, Singapore. https://doi.org/10.1007/978-981-99-8558-6_21

  • DOI: https://doi.org/10.1007/978-981-99-8558-6_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8557-9

  • Online ISBN: 978-981-99-8558-6

  • eBook Packages: Computer Science (R0)
