Abstract
Lung cancer is the leading cause of cancer-related death. Radiologists normally use computed tomography (CT) to diagnose lung nodules in lung cancer patients. A single CT scan produces hundreds of images that radiologists analyze manually, which is a heavy burden and sometimes leads to inaccurate readings. Recently, many computer-aided diagnosis (CAD) systems built on deep learning architectures have been proposed to assist radiologists. This study proposes a CAD scheme based on a 3D multi-scale vision transformer (3D-MSViT) to enhance multi-scale feature extraction and improve the efficiency of lung nodule prediction from 3D CT images. The 3D-MSViT architecture adopts a local–global transformer block structure in which the local transformer stage processes the patches of each scale separately and forwards them to the global transformer stage, which merges the multi-scale features. The transformer blocks rely entirely on the attention mechanism, with no convolutional neural network components, to reduce the number of network parameters. The proposed CAD scheme was validated on the 888 CT scans of the Lung Nodule Analysis 2016 (LUNA16) public dataset and evaluated with free-response receiver operating characteristic (FROC) analysis. The 3D-MSViT algorithm achieved a maximum sensitivity of 97.81% and a competition performance metric (CPM) of 0.911. The 3D-MSViT scheme therefore obtained results comparable to those of counterpart deep learning approaches in prior studies, with lower network complexity.
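As a rough illustration of the evaluation named above, the sketch below computes an FROC curve and the competition performance metric, which in the LUNA16 challenge is the mean sensitivity at 1/8, 1/4, 1/2, 1, 2, 4 and 8 false positives per scan. It is not the authors' code or the official evaluation script. The function names, the toy inputs, the step-function lookup (instead of interpolation), and the assumption that every true-positive candidate matches a distinct nodule are all simplifications made here for illustration.

```python
import numpy as np

# False-positive rates (per scan) averaged by the LUNA16 competition performance metric.
FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)


def froc_curve(scores, is_nodule, n_nodules, n_scans):
    """Sweep the score threshold from high to low.

    scores    : confidence of each candidate detection (hypothetical input)
    is_nodule : 1 if the candidate hits an annotated nodule, else 0
    Assumes each true-positive candidate matches a different nodule.
    Returns (false positives per scan, sensitivity), one entry per candidate.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = np.asarray(is_nodule, dtype=int)[order]
    true_pos = np.cumsum(hits)
    false_pos = np.cumsum(1 - hits)
    return false_pos / n_scans, true_pos / n_nodules


def sensitivity_at(fp_per_scan, sensitivity, rate):
    """Sensitivity at the last operating point whose FP rate does not exceed `rate`."""
    i = np.searchsorted(fp_per_scan, rate, side="right") - 1
    return float(sensitivity[i]) if i >= 0 else 0.0


def cpm(fp_per_scan, sensitivity, fp_rates=FP_RATES):
    """Mean sensitivity over the predefined false-positive rates."""
    return float(np.mean([sensitivity_at(fp_per_scan, sensitivity, r) for r in fp_rates]))


# Toy example: four candidates over two scans containing two nodules in total.
fps, sens = froc_curve(
    scores=[0.9, 0.8, 0.7, 0.6], is_nodule=[1, 0, 1, 0], n_nodules=2, n_scans=2
)
print(round(cpm(fps, sens), 4))  # 6/7, about 0.8571
```

The toy numbers are invented and unrelated to the reported CPM of 0.911; they only show how sensitivities at the seven operating points are averaged into a single score.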



Availability of data and materials
This work uses the LUNA16 dataset, which is available online at https://luna16.grand-challenge.org/.
Acknowledgements
The authors would like to thank the editors and anonymous reviewers for their constructive comments on improving this work.
Funding
This work was supported by the National Natural Science Foundation of China under Grant Numbers 61671185 and 62071153.
Author information
Contributions
HM: Methodology, formal analysis, and writing (original draft preparation). LW: Investigation and writing (review and editing). YZ: Conceptualization, resources, funding acquisition, and supervision.
Ethics declarations
Conflict of interests
We declare that we have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Mkindu, H., Wu, L. & Zhao, Y. 3D multi-scale vision transformer for lung nodule detection in chest CT images. SIViP 17, 2473–2480 (2023). https://doi.org/10.1007/s11760-022-02464-0
DOI: https://doi.org/10.1007/s11760-022-02464-0