Abstract
Colonoscopy is a procedure used to examine the colon and rectum for colorectal cancer and other abnormalities, including polyps and diverticula. Beyond the diagnosis itself, manually processing the snapshots taken during the procedure for medical record keeping consumes a large amount of the clinician’s time. This step can be automated with post-procedural machine learning algorithms that classify anatomical landmarks in the colon. In this work, we develop a pipeline for training vision transformers to identify anatomical landmarks, including the appendiceal orifice, the ileocecal valve/cecum landmark, and rectum retroflection. To increase model accuracy, we use a hybrid approach that combines algorithm-level and data-level techniques. We introduce a consistency loss to make the model more robust to label inconsistencies, as well as a semantic non-landmark sampling technique aimed at increasing focus on colonic findings. To train and test the pipeline, we annotated 307 colonoscopy videos and 2363 snapshots with the assistance of several medical experts for enhanced reliability. The algorithm identifies landmarks with an accuracy of 92% on the test dataset.
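The abstract does not define the consistency loss, so the following is only a minimal sketch of one common formulation: a cross-entropy term on the annotated label plus a penalty on disagreement between the predictions for two views of the same frame. The function names, the weight `lam`, the squared-difference penalty, and the four-class layout are all assumptions made for illustration and should not be read as the authors' method.

```python
import math

# Hypothetical class layout (assumption): the three landmarks named in the
# abstract plus a non-landmark class.
CLASSES = ["appendiceal_orifice", "ileocecal_valve_cecum",
           "rectum_retroflection", "non_landmark"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the annotated class index."""
    return -math.log(max(probs[label], 1e-12))


def consistency_penalty(probs_a, probs_b):
    """Mean squared difference between two predicted distributions."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)


def total_loss(logits_a, logits_b, label, lam=1.0):
    """Supervised loss on view A plus a weighted agreement term between views.

    logits_a, logits_b: model outputs for two views (for example, two
    augmentations) of the same frame. lam is a hypothetical weighting factor.
    """
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    return cross_entropy(p_a, label) + lam * consistency_penalty(p_a, p_b)


# Usage: the two views disagree, so the penalty adds to the supervised term.
view_a = [2.0, 0.5, 0.1, 0.3]
view_b = [0.4, 1.8, 0.2, 0.3]
loss = total_loss(view_a, view_b, label=CLASSES.index("appendiceal_orifice"))
```

In this sketch the penalty is zero when both views yield the same prediction, so the loss reduces to plain cross-entropy. One plausible motivation for such a term is that an agreement signal does not depend on the annotated label, which may help when labels are inconsistent.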
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tamhane, A., Dobkin, D., Shtalrid, O., Bouhnik, M., Posner, E., Mida, T. (2024). Consistency Loss for Improved Colonoscopy Landmark Detection with Vision Transformers. In: Cao, X., Xu, X., Rekik, I., Cui, Z., Ouyang, X. (eds) Machine Learning in Medical Imaging. MLMI 2023. Lecture Notes in Computer Science, vol 14349. Springer, Cham. https://doi.org/10.1007/978-3-031-45676-3_13
DOI: https://doi.org/10.1007/978-3-031-45676-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45675-6
Online ISBN: 978-3-031-45676-3
eBook Packages: Computer Science (R0)