Public Imaging Datasets of Gastrointestinal Endoscopy for Artificial Intelligence: a Review

Journal of Digital Imaging

Abstract

With advances in endoscopic technology and artificial intelligence, a large number of endoscopic imaging datasets have been made publicly available to researchers around the world. This study reviews and introduces these datasets. An extensive literature search was conducted in PubMed to identify appropriate datasets, and additional targeted searches were conducted in GitHub, Kaggle, and Simula to identify datasets directly. We provide a brief introduction to each dataset and evaluate the characteristics of the included datasets. Two national datasets in progress are also discussed. A total of 40 datasets of endoscopic images were included, of which 34 were accessible for use. Basic and detailed information on each dataset is reported. Of all the datasets, 16 focus on polyps and 6 focus on small bowel lesions. The largest group of datasets (n = 16) was constructed by colonoscopy only, followed by normal gastrointestinal endoscopy and capsule endoscopy (n = 9). This review may facilitate the use of public dataset resources in endoscopic research.
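
As a quick check on the proportions implied by the counts in the abstract, the following minimal Python sketch converts each count into a share of the 40 included datasets. The dictionary labels and the `share` helper are illustrative names chosen here, not taken from the article, and the categories are not assumed to be mutually exclusive.

```python
# Counts quoted in the abstract; labels are illustrative, not the authors' terms.
TOTAL_DATASETS = 40

counts = {
    "accessible for use": 34,
    "focus on polyps": 16,
    "focus on small bowel lesions": 6,
    "constructed by colonoscopy only": 16,
}


def share(count: int, total: int = TOTAL_DATASETS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of all included datasets for each category.
shares = {label: share(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_DATASETS} = {pct}%")
```

On these figures, 85% of the included datasets were accessible, while the colonoscopy-only group accounts for 40% of the total, so it is the largest single group rather than a majority.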

Data Availability

The endoscopic imaging data supporting the findings of the review are available within the article. The websites of available datasets are provided in Table 1.

Funding

This work was supported by the National Natural Science Foundation of China (82000540), Science and Technology Plan of Suzhou City (SKY2021038), Suzhou Clinical Center of Digestive Diseases (Szlcyxzx202101), and Youth Program of Suzhou Health Committee (KJXW2019001).

Author information

Contributions

All authors contributed to the study conception and design. Zhu JZ: conception and design; Zhu SQ: drafting of the article; Zhu SQ and Yin MY: literature search; Gao JW and Lin JX: data extraction; Xu C and Liu L: quality assessment; Zhu JZ and Xu CF: critical revision of the article; Xu CF and Zhu JZ: final approval of the article.

Corresponding authors

Correspondence to Chunfang Xu or Jinzhou Zhu.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 52 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhu, S., Gao, J., Liu, L. et al. Public Imaging Datasets of Gastrointestinal Endoscopy for Artificial Intelligence: a Review. J Digit Imaging 36, 2578–2601 (2023). https://doi.org/10.1007/s10278-023-00844-7

  • DOI: https://doi.org/10.1007/s10278-023-00844-7
