Masked Frequency Consistency for Domain-Adaptive Semantic Segmentation of Laparoscopic Images

Zhao, Xinkai; Hayashi, Yuichiro; Oda, Masahiro; Kitasaka, Takayuki; Mori, Kensaku

doi:10.1007/978-3-031-43907-0_63

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14220)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Semantic segmentation of laparoscopic images is important for intraoperative guidance in laparoscopic surgery. However, acquiring and annotating laparoscopic datasets is labor-intensive, which limits research on this topic. In this paper, we tackle the Domain-Adaptive Semantic Segmentation (DASS) task, which aims to train a segmentation network using only computer-generated simulated images and unlabeled real images. To bridge the large domain gap between generated and real images, we propose a Masked Frequency Consistency (MFC) module that encourages the network to learn frequency-related information of the target domain as additional cues for robust recognition. Specifically, MFC randomly masks some high-frequency information of the image to improve the consistency of the network’s predictions for low-frequency images and real images. We conduct extensive experiments with our MFC module on existing DASS frameworks and show performance improvements. Our approach achieves results comparable to a fully supervised learning method on the CholecSeg8K dataset without using any manual annotation. The code is available at github.com/MoriLabNU/MFC.
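
The masking idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that); it only assumes that masking means zeroing randomly chosen high-frequency FFT coefficients while keeping the low-frequency band intact. The function names `mask_high_frequencies` and `consistency_loss` and the parameters `low_freq_ratio` and `mask_ratio` are hypothetical, introduced here for illustration, and the cross-entropy against pseudo-labels is one possible consistency measure, not necessarily the one used in the paper.

```python
import numpy as np


def mask_high_frequencies(image, low_freq_ratio=0.1, mask_ratio=0.5, rng=None):
    """Randomly zero high-frequency FFT coefficients of an (H, W) or (H, W, C) image.

    low_freq_ratio and mask_ratio are hypothetical parameters for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape[:2]

    # Centered spectrum: after fftshift the zero frequency sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))

    # Normalized distance of every frequency bin from the spectrum center.
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot((yy - h // 2) / h, (xx - w // 2) / w)
    is_high = dist > low_freq_ratio

    # Drop each high-frequency bin with probability mask_ratio; keep all low ones.
    drop = is_high & (rng.random((h, w)) < mask_ratio)
    keep = (~drop).astype(np.float64)
    if image.ndim == 3:
        keep = keep[..., None]  # share one mask across channels

    masked = np.fft.ifftshift(spectrum * keep, axes=(0, 1))
    # The random mask is not conjugate-symmetric, so keep only the real part.
    return np.fft.ifft2(masked, axes=(0, 1)).real


def consistency_loss(probs_masked, probs_full, eps=1e-8):
    """Cross-entropy of masked-image predictions against pseudo-labels.

    Both inputs are (H, W, K) class-probability maps; the pseudo-labels are the
    argmax of the predictions on the unmasked image (an assumption of this sketch).
    """
    pseudo = probs_full.argmax(axis=-1)
    picked = np.take_along_axis(probs_masked, pseudo[..., None], axis=-1)[..., 0]
    return float(-np.log(picked + eps).mean())
```

In a training loop, one would presumably feed the masked target-domain image to the segmentation network and penalize disagreement with predictions obtained from the unmasked image; how those reference predictions are produced is a design choice this sketch leaves open.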

Acknowledgments

This work was supported in part by the JSPS KAKENHI Grant Numbers 17H00867, 21K19898, 26108006; in part by the JST CREST Grant Number JPMJCR20D5; and in part by the fellowship of the Nagoya University TMI WISE program from MEXT.

Author information

Authors and Affiliations

Graduate School of Informatics, Nagoya University, Nagoya, Japan
Xinkai Zhao, Yuichiro Hayashi, Masahiro Oda & Kensaku Mori
Information Strategy Office, Information and Communications, Nagoya University, Nagoya, Japan
Masahiro Oda
Department of Information Science, Aichi Institute of Technology, Toyota, Japan
Takayuki Kitasaka
Information Technology Center, Nagoya University, Nagoya, Japan
Kensaku Mori
Research Center for Medical Bigdata, National Institute of Informatics, Tokyo, Japan
Kensaku Mori

Corresponding authors

Correspondence to Xinkai Zhao or Kensaku Mori.

Editor information

Editors and Affiliations

Icahn School of Medicine, Mount Sinai, NYC, NY, USA; Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Emory University, Atlanta, GA, USA
Anant Madabhushi
Queen's University, Kingston, ON, Canada
Parvin Mousavi
The University of British Columbia, Vancouver, BC, Canada
Septimiu Salcudean
Yale University, New Haven, CT, USA
James Duncan
IBM Research, San Jose, CA, USA
Tanveer Syeda-Mahmood
Johns Hopkins University, Baltimore, MD, USA
Russell Taylor

Electronic supplementary material

Supplementary material 1 (PDF, 336 KB)

About this paper

Cite this paper

Zhao, X., Hayashi, Y., Oda, M., Kitasaka, T., Mori, K. (2023). Masked Frequency Consistency for Domain-Adaptive Semantic Segmentation of Laparoscopic Images. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14220. Springer, Cham. https://doi.org/10.1007/978-3-031-43907-0_63

DOI: https://doi.org/10.1007/978-3-031-43907-0_63
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43906-3
Online ISBN: 978-3-031-43907-0
eBook Packages: Computer Science, Computer Science (R0)
