Abstract
Accurate segmentation of lung nodules is key to diagnosing the lesion type of a lung nodule. The complex boundaries of lung nodules and their visual similarity to surrounding tissue make precise segmentation challenging. Traditional CNN-based lung nodule segmentation models focus on extracting local features from neighboring pixels and ignore global contextual information, which makes them prone to incomplete segmentation of nodule boundaries. In the U-shaped encoder-decoder structure, the changes in image resolution caused by up-sampling and down-sampling result in the loss of feature information, which reduces the reliability of the output features. This paper proposes a transformer pooling module and a dual-attention feature reorganization module to address these two defects. The transformer pooling module fuses the self-attention layer and the pooling layer of the transformer, which compensates for the limitations of the convolution operation, reduces the loss of feature information during pooling, and significantly decreases the computational complexity of the transformer. The dual-attention feature reorganization module employs channel and spatial attention to improve sub-pixel convolution, minimizing the loss of feature information during up-sampling. In addition, two convolutional modules are proposed, which, together with the transformer pooling module, form an encoder that adequately extracts local features and global dependencies. A fusion loss function and a deep supervision strategy are used in the decoder to train the model. The proposed model is extensively evaluated on the LIDC-IDRI dataset, achieving a highest Dice similarity coefficient of 91.84 and a highest sensitivity of 92.66, indicating that its overall capability surpasses that of the state-of-the-art UTNet. The proposed model offers superior segmentation performance for lung nodules and enables a more in-depth assessment of their shape, size, and other characteristics, which is of clinical significance and practical value in assisting physicians with the early diagnosis of lung nodules.
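To make the two mechanisms summarized above more concrete, the sketch below shows, in PyTorch, one plausible way to (1) pool the keys and values inside self-attention so the attention matrix shrinks and the quadratic cost drops, and (2) apply channel and spatial attention before sub-pixel (PixelShuffle) up-sampling. This is an illustrative sketch only, not the authors' released implementation: the class names PooledSelfAttention and DualAttentionUpsample, the pooling stride, and the gating layers are assumptions made for the example.

```python
# Illustrative sketch only (not the authors' code): hypothetical module names,
# following the general ideas in the abstract -- self-attention with spatially
# pooled keys/values, and channel/spatial attention before sub-pixel up-sampling.
import torch
import torch.nn as nn


class PooledSelfAttention(nn.Module):
    """Self-attention over a feature map whose keys/values are average-pooled.

    Pooling the key/value tokens with stride `pool` shrinks the attention matrix
    from (HW x HW) to (HW x HW/pool^2), which is how pooling-style transformer
    layers typically cut the quadratic cost.
    """

    def __init__(self, dim: int, num_heads: int = 4, pool: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.pool = nn.AvgPool2d(kernel_size=pool, stride=pool)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from the convolutional stem
        b, c, h, w = x.shape
        q = x.flatten(2).transpose(1, 2)              # (B, HW, C) queries at full resolution
        kv = self.pool(x).flatten(2).transpose(1, 2)  # (B, HW/p^2, C) pooled keys/values
        out, _ = self.attn(self.norm(q), self.norm(kv), self.norm(kv))
        return out.transpose(1, 2).reshape(b, c, h, w) + x  # residual connection


class DualAttentionUpsample(nn.Module):
    """Channel + spatial attention (CBAM-style) followed by sub-pixel up-sampling."""

    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        self.channel_gate = nn.Sequential(            # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, in_ch // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch // 4, in_ch, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(            # single-channel spatial mask
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )
        # sub-pixel convolution: expand channels by scale^2, then PixelShuffle
        self.expand = nn.Conv2d(in_ch, out_ch * scale * scale, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)                           # re-weight channels
        avg = x.mean(dim=1, keepdim=True)                      # spatial statistics
        mx, _ = x.max(dim=1, keepdim=True)
        x = x * self.spatial_gate(torch.cat([avg, mx], dim=1)) # re-weight positions
        return self.shuffle(self.expand(x))                    # spatial up-sampling


if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)
    print(PooledSelfAttention(dim=64)(feat).shape)              # torch.Size([1, 64, 32, 32])
    print(DualAttentionUpsample(in_ch=64, out_ch=32)(feat).shape)  # torch.Size([1, 32, 64, 64])
```

Note that in this sketch only the key/value path is pooled while the queries keep full resolution, so the output retains the input size and can be added back residually; that is what makes pooled attention cheap enough to use at several encoder stages.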
Graphical Abstract
References
Xiao Z, Liu B, Geng L et al (2020) Segmentation of lung nodules using improved 3D-UNet neural network. Symmetry 12(11):1787. https://doi.org/10.3390/sym12111787
Oudkerk M, Liu SY, Heuvelmans MA et al (2021) Lung cancer LDCT screening and mortality reduction—evidence, pitfalls and future perspectives. Nat Rev Clin Oncol 18(3):135–151. https://doi.org/10.1038/s41571-020-00432-6
Keetha NV, Annavarapu CSR (2020) U-Det: A modified U-Net architecture with bidirectional feature network for lung nodule segmentation. arXiv preprint arXiv:2003.09293. https://doi.org/10.48550/arXiv.2003.09293
Cao H, Liu H, Song E et al (2020) Dual-branch residual network for lung nodule segmentation. Appl Soft Comput 86:105934. https://doi.org/10.1016/j.asoc.2019.105934
Liu H, Geng F, Guo Q et al (2018) A fast weak-supervised pulmonary nodule segmentation method based on modified self-adaptive FCM algorithm. Soft Comput 22(12):3983–3995. https://doi.org/10.1007/s00500-017-2608-5
Amorim PHJ, Moraes TF, da Silva JVL et al (2019) Lung nodule segmentation based on convolutional neural networks using multi-orientation and patchwise mechanisms[C]//ECCOMAS Thematic Conference on Computational Vision and Medical Image Processing. Springer, Cham, pp 286–295. https://doi.org/10.1007/978-3-030-32040-9_30
Cao H, Wang Y, Chen J et al (2023) Swin-Unet: Unet-like pure transformer for medical image segmentation[C]//Computer Vision–ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part III. Springer Nature Switzerland, Cham, pp 205–218. https://doi.org/10.1007/978-3-031-25066-8_9
Koutini K, Eghbal-Zadeh H, Dorfer M et al (2019) The receptive field as a regularizer in deep convolutional neural networks for acoustic scene classification[C]//2019 27th European signal processing conference (EUSIPCO). IEEE, pp 1–5. https://doi.org/10.23919/EUSIPCO.2019.8902732
Qiao S, Chen LC, Yuille A (2021) DetectoRS: Detecting objects with recursive feature pyramid and switchable atrous convolution[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 10213–10224. https://doi.org/10.48550/arXiv.2006.02334
Karimi D, Salcudean SE (2019) Reducing the hausdorff distance in medical image segmentation with convolutional neural networks. IEEE Trans Med Imaging 39(2):499–513. https://doi.org/10.1109/TMI.2019.2930068
Letcher A (2020) On the impossibility of global convergence in multi-loss optimization. arXiv preprint arXiv:2005.12649. https://doi.org/10.48550/arXiv.2005.12649
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, pp 234–241. https://doi.org/10.1007/978-3-319-24574-4_28
Zhou Z, Siddiquee MMR, Tajbakhsh N et al (2018) Unet++: a nested u-net architecture for medical image segmentation[M]//Deep learning in medical image analysis and multimodal learning for clinical decision support. Springer, Cham, pp 3–11. https://doi.org/10.1007/978-3-030-00889-5_1
Huang H, Lin L, Tong R et al (2020) Unet 3+: a full-scale connected unet for medical image segmentation[C]//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp 1055–1059. https://doi.org/10.1109/ICASSP40776.2020.9053405
Ibtehaz N, Rahman MS (2020) MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw 121:74–87. https://doi.org/10.1016/j.neunet.2019.08.025
Wang Z, Zou N, Shen D et al (2020) Non-local u-nets for biomedical image segmentation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 34(4):6315–6322. https://doi.org/10.1609/aaai.v34i04.6100
Tang H, Zhang C, Xie X (2019) Nodulenet: decoupled false positive reduction for pulmonary nodule detection and segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, pp 266–274. https://doi.org/10.1007/978-3-030-32226-7_30
Maqsood M, Yasmin S, Mehmood I et al (2021) An efficient DA-net architecture for lung nodule segmentation. Mathematics 9(13):1457. https://doi.org/10.3390/math9131457
Banu SF, Sarker M, Kamal M et al (2021) AWEU-Net: An Attention-Aware Weight Excitation U-Net for Lung Nodule Segmentation. Appl Sci 11(21):10132. https://doi.org/10.3390/app112110132
Dhamija T, Gupta A, Gupta S et al (2023) Semantic segmentation in medical images through transfused convolution and transformer networks. Appl Intell 53(1):1132–1148. https://doi.org/10.1007/s10489-022-03642-w
Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. Adv Neural Inf Proces Syst 30. https://doi.org/10.48550/arXiv.1706.03762
Wu YH, Liu Y, Zhan X et al (2022) P2T: Pyramid pooling transformer for scene understanding. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2022.3202765
Wang W, Xie E, Li X et al (2022) Pvt v2: Improved baselines with pyramid vision transformer. Comput Vis Media 8(3):415–424. https://doi.org/10.1007/s41095-022-0274-8
Chen J, Lu Y, Yu Q et al (2021) TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306. https://doi.org/10.48550/arXiv.2102.04306
Wang H, Cao P, Wang J et al (2022) UCTransNet: Rethinking the skip connections in U-Net from a channel-wise perspective with transformer[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 36(3):2441–2449. https://doi.org/10.1609/aaai.v36i3.20144
Gao Y, Zhou M, Metaxas DN (2021) UTNet: a hybrid transformer architecture for medical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, pp 61–71. https://doi.org/10.1007/978-3-030-87199-4_6
Kirillov A, Wu Y, He K et al (2020) PointRend: Image segmentation as rendering[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 9799–9808. https://doi.org/10.48550/arXiv.1912.08193
Im D, Han D, Choi S et al (2019) DT-CNN: dilated and transposed convolution neural network accelerator for real-time image segmentation on mobile devices[C]//2019 IEEE international symposium on circuits and systems (ISCAS). IEEE, pp 1–5. https://doi.org/10.1109/ISCAS.2019.8702243
Xiong S, Wu X, Chen H et al (2021) Bi-directional skip connection feature pyramid network and sub-pixel convolution for high-quality object detection. Neurocomputing 440:185–196. https://doi.org/10.1016/j.neucom.2021.01.021
Tian Z, He T, Shen C et al (2019) Decoders matter for semantic segmentation: Data-dependent decoding enables flexible feature aggregation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp 3126–3135. https://doi.org/10.48550/arXiv.1903.02120
Wang J, Chen K, Xu R et al (2021) CARAFE++: Unified Content-Aware ReAssembly of FEatures. IEEE Trans Pattern Anal Mach Intell 44(9):4674–4687. https://doi.org/10.1109/TPAMI.2021.3074370
Wang J, Chen K, Xu R et al (2019) CARAFE: Content-aware reassembly of features[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. pp 3007–3016. https://doi.org/10.48550/arXiv.1905.02188
Bang S, Park S, Kim H et al (2019) Encoder–decoder network for pixel-level road crack detection in black-box images. Comp-Aided Civil Infrastruct Eng 34(8):713–727. https://doi.org/10.1111/mice.12440
Ding Y, Ma Z, Wen S et al (2021) AP-CNN: Weakly supervised attention pyramid convolutional neural network for fine-grained visual classification. IEEE Trans Image Process 30:2826–2836. https://doi.org/10.1109/TIP.2021.3055617
Chollet F (2017) Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1251–1258. https://doi.org/10.48550/arXiv.1610.02357
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. pp 7132–7141. https://doi.org/10.48550/arXiv.1709.01507
Al-Shabi M, Lan BL, Chan WY et al (2019) Lung nodule classification using deep local–global networks. Int J Comput Assist Radiol Surg 14(10):1815–1819. https://doi.org/10.1007/s11548-019-01981-7
Luo P, Wang X, Shao W et al (2018) Towards understanding regularization in batch normalization. arXiv preprint arXiv:1809.00846. https://doi.org/10.48550/arXiv.1809.00846
Liu Y, Sangineto E, Bi W et al (2021) Efficient training of visual transformers with small-size datasets. Adv Neural Inf Proces Syst 34:23818–23830. https://doi.org/10.48550/arXiv.2106.03746
Bello I, Zoph B, Vaswani A et al (2019) Attention augmented convolutional networks[C]//Proceedings of the IEEE/CVF international conference on computer vision. pp 3286–3295. https://doi.org/10.48550/arXiv.1904.09925
Zhang Y, Higashita R, Fu H et al (2021) A multi-branch hybrid transformer network for corneal endothelial cell segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, pp 99–108
Woo S, Park J, Lee JY et al (2018) CBAM: Convolutional block attention module[C]//Proceedings of the European conference on computer vision (ECCV). pp 3–19. https://doi.org/10.1007/978-3-030-01234-2_1
Sun Y, Chen J, Liu Q et al (2020) Learning image compressed sensing with sub-pixel convolutional generative adversarial network. Pattern Recognit 98:107051. https://doi.org/10.1016/j.patcog.2019.107051
Acknowledgements
We are very grateful to each editor and reviewer for their valuable comments, which have been the main driving force behind the improvement of this paper. We also thank Xiaotong Li for his suggestions on revising the paper.
Funding
This work is supported by the Joint Funds of the National Natural Science Foundation of China (No. U21A20469) and the Fundamental Research Program of Shanxi Province (No. 202203021211177).
Ethics declarations
Ethics approval and consent to participate
This study meets the requirements of The Code of Ethics of the World Medical Association.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, X., Jiang, A., Qiu, Y. et al. TPFR-Net: U-shaped model for lung nodule segmentation based on transformer pooling and dual-attention feature reorganization. Med Biol Eng Comput 61, 1929–1946 (2023). https://doi.org/10.1007/s11517-023-02852-9