Abstract
Multi-task learning for joint medical image segmentation and classification holds promise for improving diagnostic accuracy and reliability in clinical settings. Current approaches often rely on unidirectionally guiding one task with a single high-level feature from the other, which fails to fully exploit the available information and can lead to suboptimal outcomes and diagnostic errors. Multi-task frameworks based on Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs) have achieved notable success in medical image analysis; however, CNNs struggle to capture long-range dependencies, and ViTs are computationally intensive. Motivated by these limitations, we propose a novel Uncertainty Bidirectional Guidance multi-task Mamba network (UBGM) for efficient and reliable medical image analysis. UBGM's encoder adopts a Mamba structure, which excels at long-range modeling while maintaining computational efficiency with linear complexity. An uncertainty coarse-segmentation guidance module performs interactive learning between the tasks, generating multiple high-level features for classification together with coarse segmentation results that incorporate uncertainty. To better exploit segmentation information, we design an uncertainty classification decoder that produces category information and features to assist segmentation correction. True bidirectional guidance is achieved through the mutual assistance of the two tasks, improving model performance. Experiments on public datasets demonstrate that UBGM outperforms existing benchmark models, showing its potential for high performance and reliability in multi-task networks.
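The uncertainty estimates that drive the bidirectional guidance follow the evidential deep learning formulation, in which non-negative evidence parameterizes a Dirichlet distribution and uncertainty scales inversely with total evidence. As a hedged illustration only (the function and variable names below are hypothetical, not the authors' implementation), this mechanism can be sketched as:

```python
import numpy as np

def evidential_uncertainty(logits):
    """Map raw network outputs to subjective-logic belief masses and an
    uncertainty score, in the style of evidential deep learning.

    Evidence is made non-negative (softplus here), a Dirichlet parameter
    alpha = evidence + 1 is formed, and uncertainty is K / sum(alpha)
    for K classes, so beliefs and uncertainty sum to one.
    """
    evidence = np.log1p(np.exp(logits))   # softplus: non-negative evidence
    alpha = evidence + 1.0                # Dirichlet concentration parameters
    strength = alpha.sum()                # Dirichlet strength S
    belief = evidence / strength          # per-class belief masses
    uncertainty = len(logits) / strength  # K / S, lies in (0, 1]
    return belief, uncertainty

# Strong evidence for one class -> low uncertainty
b, u = evidential_uncertainty(np.array([8.0, -4.0, -4.0]))
# Near-zero evidence everywhere -> high uncertainty
b2, u2 = evidential_uncertainty(np.array([0.0, 0.0, 0.0]))
```

Because sum(belief) + uncertainty = 1 by construction, such a score can weight how strongly one task's features are trusted when guiding the other.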
Data availability
No datasets were generated or analysed during the current study.
Author information
Authors and Affiliations
Contributions
Wu wrote the manuscript and conducted all experiments. Gou reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors did not receive support from any organization for the submitted work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, X., Gou, G. Uncertainty bidirectional guidance of multi-task mamba network for medical image classification and segmentation. SIViP 19, 29 (2025). https://doi.org/10.1007/s11760-024-03633-z