Abstract
Multi-label chest X-ray (CXR) image classification aims to predict multiple disease labels for a single image. This task is more challenging than single-label classification. For instance, convolutional neural networks (CNNs) often struggle to capture the statistical dependencies between labels. Furthermore, simply concatenating a CNN and a Transformer provides no direct interaction or information exchange between the two models. To address these issues, we propose a hybrid deep learning network named CheXNet. It consists of three main parts in the CNN and Transformer branches: the Label Embedding and Multi-Scale Pooling module (MEMSP), the Inner Branch module (IB), and the Information Interaction module (IIM). Firstly, we employ label embedding to automatically capture label dependencies. Secondly, we utilize Multi-Scale Pooling (MSP) to fuse features from different scales and an IB to incorporate local detailed features. Additionally, we introduce a parallel structure that allows the CNN and the Transformer to interact through the IIM. The CNN can provide richer inputs to the Transformer through bottom-up feature extraction, whilst the Transformer can guide feature extraction in the CNN using top-down attention mechanisms. The effectiveness of the proposed method has been validated through qualitative and quantitative experiments on two large-scale multi-label CXR datasets, achieving average AUCs of 82.56% and 76.80% on CXR11 and CXR14, respectively.
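The abstract reports an average AUC per dataset but does not say how the average is taken. The sketch below assumes the common convention for multi-label CXR benchmarks: compute a ROC AUC for each disease label separately, then take the unweighted mean. The function names (`binary_auc`, `mean_auc`) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def binary_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC for one label via pairwise comparison (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when a label has only one class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(y_true, y_score) -> float:
    """Unweighted mean of per-label AUCs; rows are images, columns are labels."""
    n_labels = len(y_true[0])
    per_label = [
        binary_auc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy data (made up for illustration): 4 images, 2 disease labels.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.4, 0.5]]
result = mean_auc(labels, scores)
```

The pairwise form is quadratic in the number of images and is meant only to make the definition explicit; for datasets of realistic size a rank-based computation would be the usual choice.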
This work is supported by the Basic Research and Applied Basic Research Key Project in General Colleges and Universities of Guangdong Province, China (2021ZDZX1032); the Special Project of Guangdong Province, China (2020A1313030021); and the Scientific Research Project of Wuyi University (2018TP023, 2018GR003).
Notes
1. CXR11: kaggle.com/competitions/ranzcr-clip-catheter-line-classification/data
2. CXR14: nihcc.app.box.com/v/ChestXray-NIHCC
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, X. et al. (2024). CheXNet: Combing Transformer and CNN for Thorax Disease Diagnosis from Chest X-ray Images. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14437. Springer, Singapore. https://doi.org/10.1007/978-981-99-8558-6_7
DOI: https://doi.org/10.1007/978-981-99-8558-6_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8557-9
Online ISBN: 978-981-99-8558-6
eBook Packages: Computer Science, Computer Science (R0)