Abstract
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm in which tasks arrive sequentially rather than being pooled, since strict data sharing policies often prevent gathering them. However, the task evolution can span a wide scope that involves shifts in both image appearance and segmentation semantics with intricate correlation, causing concurrent appearance and semantic forgetting. To address this issue, we propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs to mimic past task data. The framework focuses on two aspects: modeling image-mask correspondence and promoting scalability for diverse tasks. Specifically, we introduce a novel Bayesian Joint Diffusion (BJD) model for high-quality synthesis of image-mask pairs, with their correspondence explicitly preserved by conditional denoising. Furthermore, we develop a Task-Oriented Adapter (TOA) that recalibrates prompt embeddings to modulate the diffusion model, making the data synthesis compatible with different tasks. Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show a clear advantage of CGR in alleviating concurrent appearance and semantic forgetting. Code is available at https://github.com/jingyzhang/CGR.
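The replay idea summarized above can be illustrated with a minimal sketch. It is not the released CGR code: the generator is a random stub standing in for a task-conditioned image-mask synthesizer (it is not the BJD model or the TOA), and every name in it (`stub_generator`, `build_replay_set`, `n_replay_per_task`) is hypothetical. It shows only the data flow of generative replay, in which each new task is trained on its own real pairs mixed with synthetic pairs for the tasks seen earlier.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A pair is (image, mask, task_id). Nested lists stand in for image tensors.
# All names here are hypothetical and used for illustration only.
Pair = Tuple[list, list, str]


def stub_generator(task_id: str, rng: random.Random, size: int = 4) -> Pair:
    """Stand-in for a task-conditioned image-mask generator (NOT the BJD model)."""
    image = [[rng.random() for _ in range(size)] for _ in range(size)]
    # Derive the mask from the sampled image so the pair stays in correspondence.
    mask = [[1 if px > 0.5 else 0 for px in row] for row in image]
    return image, mask, task_id


def build_replay_set(
    current_pairs: Sequence[Pair],
    past_task_ids: Sequence[str],
    generator: Callable[[str, random.Random], Pair],
    n_replay_per_task: int,
    seed: int = 0,
) -> List[Pair]:
    """Mix real pairs of the current task with synthetic pairs of past tasks."""
    rng = random.Random(seed)
    replayed = [
        generator(task_id, rng)
        for task_id in past_task_ids
        for _ in range(n_replay_per_task)
    ]
    mixed = list(current_pairs) + replayed
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    seen: List[str] = []
    for task in ["cardiac", "fundus", "prostate"]:
        # Placeholder "real" data; an actual run would load the task's dataset.
        real = [stub_generator(task, random.Random(i)) for i in range(3)]
        train_set = build_replay_set(real, seen, stub_generator, n_replay_per_task=2)
        print(task, len(train_set))  # a segmentation model would be trained on train_set
        seen.append(task)
```

In this sketch no past-task images are stored. Only the generator and the list of seen task identifiers carry over between tasks, which is the property that makes replay compatible with data sharing restrictions.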
Acknowledgments
This research is partially supported by the Specific Project of Shanghai Jiao Tong University for “Invigorating Inner Mongolia through Science and Technology” (2022XYJG0001-01–17), funding from the Star of SJTU Programme, a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No.: T45-401/22-N), and a grant from the Hong Kong Innovation and Technology Fund (Project No.: MHP/085/21).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, W., Zhang, J., Heng, PA., Gu, L. (2024). Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15008. Springer, Cham. https://doi.org/10.1007/978-3-031-72111-3_8
DOI: https://doi.org/10.1007/978-3-031-72111-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72110-6
Online ISBN: 978-3-031-72111-3
eBook Packages: Computer Science, Computer Science (R0)