A pixel and channel enhanced up-sampling module for biomedical image segmentation

  • Research
  • Published in Machine Vision and Applications

Abstract

Up-sampling operations are frequently used to recover the spatial resolution of feature maps in neural networks for segmentation tasks. However, current up-sampling methods, such as bilinear interpolation and deconvolution, do not fully exploit the relationships within feature maps, which hinders learning discriminative features for semantic segmentation. In this paper, we propose a pixel and channel enhanced up-sampling (PCE) module for low-resolution feature maps, which uses the relationships among adjacent pixels and channels to learn discriminative high-resolution feature maps. Specifically, the proposed up-sampling module comprises two main operations: (1) increasing the spatial resolution of feature maps with pixel shuffle and (2) recalibrating the channel-wise high-resolution feature response. The proposed module can be integrated into both CNN and Transformer segmentation architectures. Extensive experiments on three biomedical image datasets of different modalities, including computed tomography (CT), magnetic resonance imaging (MRI) and micro-optical sectioning tomography (MOST) images, demonstrate that the proposed method effectively improves the performance of representative segmentation models.
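As a rough illustration of the two operations named in the abstract, here is a minimal PyTorch sketch. It is our own reading of the description, not the authors' implementation: we assume a 1x1 convolution expands the channels before pixel shuffle, and a squeeze-and-excitation-style gate (an assumption for the channel enhancement step) recalibrates the high-resolution response.

```python
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """SE-style channel recalibration (our assumption for step 2)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Global average pool -> per-channel weights in (0, 1).
        w = self.fc(self.pool(x).flatten(1))
        return x * w.view(x.size(0), -1, 1, 1)


class PCEUpsample(nn.Module):
    """Sketch of the two PCE operations: (1) pixel-shuffle up-sampling,
    (2) channel-wise recalibration of the high-resolution response."""

    def __init__(self, in_channels: int, out_channels: int, scale: int = 2):
        super().__init__()
        # Expand channels so pixel shuffle yields out_channels at scale x.
        self.expand = nn.Conv2d(in_channels, out_channels * scale ** 2, kernel_size=1)
        self.shuffle = nn.PixelShuffle(scale)
        self.gate = ChannelGate(out_channels)

    def forward(self, x):
        return self.gate(self.shuffle(self.expand(x)))


x = torch.randn(1, 64, 32, 32)
print(PCEUpsample(64, 32)(x).shape)  # torch.Size([1, 32, 64, 64])
```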


Data Availability

The authors confirm that the Synapse and MSD data supporting the findings of this study are available within the article; the MOST data are not publicly available.

Code Availability

Not applicable.

Notes

  1. https://www.synapse.org/#!Synapse:syn3193805/wiki/217789.

  2. http://medicaldecathlon.com/.

Acknowledgements

This work is supported by the Guangdong Provincial Key Laboratory of Human Digital Twin (No. 2022B1212010004), the Open-Fund of WNLO (Grant No. 2018WNLOKF027) and the Graduate Innovative Fund of Wuhan Institute of Technology (No. CX2022349). We thank the Optical Bioimaging Core Facility of WNLO-HUST for the support in MOST data acquisition.

Funding

Funding for this study was received from the Guangdong Provincial Key Laboratory of Human Digital Twin (No. 2022B1212010004), the Fundamental Research Funds for the Central Universities of China (Grant No. PA2023IISL0095) and the Graduate Innovative Fund of Wuhan Institute of Technology (No. CX2022349).

Author information

Contributions

Not applicable.

Corresponding author

Correspondence to Guoping Xu.

Ethics declarations

Conflict of interest

We confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 4 Ablation experiments on different components of the PCE module on the MOST dataset (PS: only the pixel shuffle operation is added; GRE: the global relationship enhancement is added after bilinear up-sampling)
Table 5 Ablation experiments on different attention mechanisms in the PCE module on the MOST dataset (SE and ECA: other attention mechanisms added on top of PS)

This appendix details how to plug the PCE module into other architectures. The PCE module proposed in this paper can be seamlessly integrated into other segmentation architectures, as illustrated in Figs. 5 and 6: it directly replaces the up-sampling module (indicated by the red block and red arrows in the figures) in the U-Net and Fast-SCNN models, improving segmentation accuracy compared with traditional up-sampling modules.
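As a concrete illustration of such a drop-in replacement, the hypothetical U-Net decoder step below swaps a transposed-convolution up-sampler for the PCEUpsample sketch given after the abstract; the skip-connection logic is unchanged. Block names and channel sizes are ours, not the models' actual configurations.

```python
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    """Hypothetical U-Net decoder step with the up-sampler swapped out.

    Assumes PCEUpsample from the earlier sketch is in scope.
    """

    def __init__(self, in_channels: int, skip_channels: int, out_channels: int):
        super().__init__()
        # Original up-sampler would be, e.g.:
        #   nn.ConvTranspose2d(in_channels, in_channels // 2, 2, stride=2)
        self.up = PCEUpsample(in_channels, in_channels // 2, scale=2)
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels // 2 + skip_channels, out_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                   # PCE up-sampling instead of deconvolution
        x = torch.cat([x, skip], dim=1)  # usual U-Net skip connection
        return self.conv(x)
```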

The two phases of the PCE module serve different functions, and to determine the contribution of each key component to its success, we performed ablation experiments on both aspects, as shown in Table 4. Integrating the global relationship enhancement into our up-sampling module increases the Dice score by 0.82%. Moreover, the results show that the pixel shuffle (PS) operation in our PCE module is the more critical component for the segmentation task.
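For concreteness, the two Table 4 variants could look like the following sketch (layer choices are our assumptions, reusing ChannelGate from the first sketch):

```python
import torch.nn as nn

# "PS": pixel-shuffle up-sampling alone, with no channel recalibration.
ps_only = nn.Sequential(
    nn.Conv2d(64, 32 * 2 ** 2, kernel_size=1),
    nn.PixelShuffle(2),
)

# "GRE": bilinear up-sampling followed only by the global relationship
# (channel recalibration) step; assumes ChannelGate from the first sketch.
gre_only = nn.Sequential(
    nn.Conv2d(64, 32, kernel_size=1),
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    ChannelGate(32),
)
```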

These experiments demonstrate the effectiveness of the global relationship enhancement. In addition, we conducted ablation experiments on the attention mechanism in the PCE module, as shown in Table 5. Replacing the attention mechanism in the PCE module with SE or ECA in turn decreases the segmentation results, while the attention mechanism proposed in this paper achieves 87.79%. Compared with other attention mechanisms, the proposed global relationship enhancement is thus more advantageous.
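For reference, the ECA variant compared in Table 5 replaces the fully connected squeeze-and-excitation bottleneck with a 1-D convolution across neighbouring channels. The sketch below follows the standard ECA design and is not taken from the paper.

```python
import torch.nn as nn


class ECAGate(nn.Module):
    """ECA-style channel attention: a 1-D convolution over the channel
    dimension replaces the SE fully connected bottleneck (standard ECA
    design; kernel size is a hyperparameter, 3 assumed here)."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # (B, C, 1, 1) -> (B, 1, C): convolve across neighbouring channels.
        w = self.pool(x).squeeze(-1).transpose(1, 2)
        w = self.sigmoid(self.conv(w)).transpose(1, 2).unsqueeze(-1)
        return x * w
```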

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, X., Xu, G., Wu, X. et al. A pixel and channel enhanced up-sampling module for biomedical image segmentation. Machine Vision and Applications 35, 30 (2024). https://doi.org/10.1007/s00138-024-01513-7
