
A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14437))


Abstract

Pneumonia is a prevalent disease, and some pneumonia patients develop parapneumonic effusion. Among pneumonia patients, the mortality rate of those with parapneumonic effusion is higher than that of those without. Parapneumonic effusion is staged as either uncomplicated or complicated: patients with complicated parapneumonic effusion require pleural drainage, whereas patients with uncomplicated parapneumonic effusion require only antibiotic treatment. Consequently, staging parapneumonic effusion plays a crucial role in reducing mortality among pneumonia patients. The previous method employs convolutional neural networks to extract features from CT slices and graph neural networks to classify the resulting slice-level feature sequence. We argue, however, that transformers outperform graph neural networks at feature-sequence classification, and therefore integrate a transformer into our method. Moreover, distinguishing uncomplicated from complicated parapneumonic effusion from CT slices alone is challenging because the two stages look similar, so we incorporate additional information in the form of a prompt to aid staging. This yields a promptable model for parapneumonic effusion staging that exploits multimodal information. Our method surpasses previous approaches, achieving an F1-score of 92.72 and an AUC of 96.67.
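The pipeline the abstract describes (per-slice features classified as a sequence by a transformer, with a prompt token carrying extra multimodal information) can be sketched numerically. This is a minimal illustration, not the authors' implementation: the dimensions, the single attention head, the random stand-in weights, and the use of the prompt position as the classification token are all assumptions made for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # feature dimension (illustrative)
n_slices = 8    # CT slices in one scan (illustrative)

# Stand-ins for CNN-extracted slice features and an encoded clinical prompt.
slice_feats = rng.standard_normal((n_slices, d))
prompt = rng.standard_normal((1, d))

# Prepend the prompt so it can attend to, and be attended by, every slice.
seq = np.concatenate([prompt, slice_feats], axis=0)  # (n_slices + 1, d)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ v

Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out = self_attention(seq, Wq, Wk, Wv)               # (n_slices + 1, d)

# Classify from the prompt position, analogous to a [CLS] token:
# two classes, uncomplicated vs. complicated effusion.
Wc = rng.standard_normal((d, 2)) / np.sqrt(d)
logits = out[0] @ Wc
probs = np.exp(logits) / np.exp(logits).sum()
```

A real model would stack several such attention layers with feed-forward blocks and learn all the weight matrices end to end; the sketch only shows how a prompt token lets auxiliary information interact with the slice-feature sequence before classification.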



Acknowledgements

This work was supported in part by the Ningbo Clinical Research Center for Medical Imaging (No. 2021L003) and by the Project of Ningbo Leading Medical & Health Discipline (No. 2022-S02).

Author information

Correspondence to Yao Xiang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Chen, Y., Liu, Q., Xiang, Y. (2024). A Multi-modality Driven Promptable Transformer for Automated Parapneumonic Effusion Staging. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14437. Springer, Singapore. https://doi.org/10.1007/978-981-99-8558-6_21

  • DOI: https://doi.org/10.1007/978-981-99-8558-6_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8557-9

  • Online ISBN: 978-981-99-8558-6

  • eBook Packages: Computer Science (R0)
