Novel benchmark database of digitized and calibrated cervical cells for artificial intelligence based screening of cervical cancer

Sarwar, Abid; Suri, Jyotsna; Ali, Mehbob; Sharma, Vinod

doi:10.1007/s12652-016-0353-8


Original Research
Published: 10 March 2016

Journal of Ambient Intelligence and Humanized Computing, volume 7, pages 593–606 (2016)

Abid Sarwar¹,
Jyotsna Suri²,
Mehbob Ali¹ &
Vinod Sharma¹


Abstract

The primary objective of this research is to develop a novel benchmark database of digitized and calibrated cervical cells obtained from Papanicolaou smear test slides, the test used for cervical cancer screening. The database can serve as a tool for designing, developing, training, testing and validating artificial intelligence based systems for prognosis of cervical cancer through characterization and classification of Papanicolaou smear images. Other researchers can also use it to compare the performance of machine learning and image processing algorithms. The database can be obtained by sending a request to the corresponding author or downloaded from http://digitalpapsmeardb.in/. Besides developing the database, we present a novel artificial intelligence based hybrid ensemble technique for efficient screening of cervical cancer by automated analysis of Papanicolaou smear images. Correct and timely diagnosis of cervical cancer is one of the major problems in medicine, and the literature indicates that pattern recognition techniques can help improve it. The Papanicolaou smear (also referred to as the Pap smear) is a microscopic examination of human cells scraped from the cervix, the lower, narrow part of the uterus. The sample, stained by the Papanicolaou method, is examined under a microscope for unusual developments indicating precancerous or potentially precancerous changes. Abnormal findings, if observed, are followed up with more precise diagnostic procedures. Examining cell images for abnormalities in the cervix allows prompt action and thereby reduces the incidence of and deaths from cervical cancer. The Pap smear is the most widely used technique for cervical cancer screening.
The Pap smear test, if performed within a regular screening program with proper follow-up, can reduce cervical cancer mortality by up to 80 % (Arbyn et al. Ann Oncol 21:448–458, 2010). The contribution of this paper is a rich machine learning database of quantitatively profiled and calibrated cervical cells obtained from Pap smear test slides. The database contains data from about 200 clinical cases (8091 cervical cells) collected from multiple health care centers to ensure diversity in the data. The Pap smear slides were processed using a multi-headed digital microscope, and the resulting images of cervical cells were passed through several data preprocessing routines. After preprocessing, the cells were morphologically profiled and scaled to obtain separate quantitative measurements of features of the cytoplasm and of the nucleus. Multiple cytotechnicians and histopathologists carefully assigned the cells in the database to classes according to the 2001 Bethesda system of classification. In addition, we applied a novel hybrid ensemble system to this database in order to evaluate the effectiveness of both the database and the ensemble technique for screening cervical cancer by categorization of Pap smear data. The paper also presents a comparative analysis of multiple artificial intelligence based classification algorithms for prognosis of cervical cancer. To evaluate the effectiveness and correctness of the database, the authors used it to train, test and validate fifteen different machine learning algorithms. All the algorithms trained with this database performed well in screening for cervical cancer.
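The abstract does not list the features recorded for each cell. As an illustration only, the sketch below shows two descriptors commonly derived from separate nucleus and cytoplasm measurements. The record layout, field names and example values are hypothetical and are not taken from the database, and the nucleus-to-cell area ratio shown is one convention among several.

```python
import math
from dataclasses import dataclass


@dataclass
class CellProfile:
    """Hypothetical record for one profiled cell (field names are illustrative)."""
    nucleus_area: float       # calibrated units, e.g. square micrometres
    cytoplasm_area: float     # same units as nucleus_area
    nucleus_perimeter: float  # calibrated length units


def nc_ratio(cell: CellProfile) -> float:
    """Nucleus area as a fraction of total cell area (nucleus + cytoplasm)."""
    return cell.nucleus_area / (cell.nucleus_area + cell.cytoplasm_area)


def nucleus_roundness(cell: CellProfile) -> float:
    """4*pi*area / perimeter**2, which equals 1.0 for a perfect circle."""
    return 4 * math.pi * cell.nucleus_area / cell.nucleus_perimeter ** 2


# Made-up example values, not measurements from the database.
cell = CellProfile(nucleus_area=50.0, cytoplasm_area=150.0, nucleus_perimeter=26.0)
print(nc_ratio(cell), nucleus_roundness(cell))
```

Descriptors of this kind turn each cell image into a fixed-length numeric vector, which is the form most standard classifiers expect.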
For the two-class problem, all the algorithms trained with the database showed efficiencies of about 93 to 95 %, whereas for the multi-class problem the efficiencies ranged from about 69 to 78 %. The results indicate that the database can be used effectively for developing new machine learning based techniques for automated screening of cervical cancer. They also indicate that the hybrid ensemble technique is an efficient method for classifying Pap smear images and hence can be used effectively for diagnosis of cervical cancer. Among all the algorithms implemented, the hybrid ensemble approach performed best, with an efficiency of about 98 % for the two-class problem and about 86 % for the seven-class problem. Its results were significantly better than those of all the standalone classifiers for both the two-class and multi-class problems.
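The abstract does not describe how the hybrid ensemble combines its member classifiers. As a rough illustration of the general idea of an ensemble, the sketch below uses plain majority voting over three base classifiers and measures the fraction of correct predictions. Majority voting is a stand-in here, not the authors' method, and the threshold rules, feature tuples and labels are all invented for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties go to the label encountered first."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])


def accuracy(classifiers, samples, labels):
    """Fraction of samples whose ensemble prediction matches the true label."""
    correct = sum(ensemble_predict(classifiers, s) == y for s, y in zip(samples, labels))
    return correct / len(labels)


# Hypothetical base classifiers: simple threshold rules on (nc_ratio, nucleus_area).
def by_ratio(sample):
    return "abnormal" if sample[0] > 0.3 else "normal"


def by_area(sample):
    return "abnormal" if sample[1] > 60.0 else "normal"


def by_product(sample):
    return "abnormal" if sample[0] * sample[1] > 15.0 else "normal"


classifiers = [by_ratio, by_area, by_product]

# Made-up two-class toy data, not drawn from the database.
samples = [(0.10, 40.0), (0.45, 90.0), (0.35, 50.0), (0.20, 70.0)]
labels = ["normal", "abnormal", "abnormal", "normal"]

print(accuracy(classifiers, samples, labels))
```

In this toy data the third and fourth samples are each misjudged by one base rule, and the vote of the other two corrects the error, which is the basic motivation for combining classifiers.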


References

Anagnostou T, Remzi M, Lykourinas M, Djavan B (2003) Artificial neural networks for decision-making in Urologic Oncology. Eur Urol 43(6):596–603
Apgar BS, Zoschnick L, Wright TC (2003) The 2001 Bethesda system terminology. Am Fam Phys 68(10):1992–1998
Arbyn M et al (2010) European guidelines for quality assurance in cervical cancer screening. Second edition—summary document. Ann Oncol 21(3):448–458
Bamford P, Lovell B (1996) A water immersion algorithm for cytological image segmentation. In: Proc. APRS image segmentation workshop, Sydney, pp 75–79
Bratko I, Kononenko I (1987) Learning rules from incomplete and noisy data. In: Interactions in Artificial Intelligence and Statistical Methods. Technical Press, Hampshire
Braumann UD, Kuska JP, Einenkel J, Horn LC, Loffler M, Hockel M (2005) Three-dimensional reconstruction and quantification of cervical carcinoma invasion fronts from histological serial sections. Med Imaging IEEE Trans 24(10):1286–1307
Catlett J (1991) On changing continuous attributes into ordered discrete attributes. In: Proceedings of European working session on learning-91, Portugal, March 4–6, pp 164–178
Catto JWF, Linkens DA, Abbod MF, Chen M, Burton J, Feeley KM, Hamdy FC (2003) Artificial intelligence in predicting bladder cancer outcome: a comparison of neuro-fuzzy modeling and artificial neural networks. Clin Cancer Res 9:4172
Chang C-L, Hsu M-Y (2009) The study that applies artificial intelligence and logistic regression for assistance in differential diagnostic of pancreatic cancer. Expert Syst Appl 36(7):10663–10672
Chaturvedi A, Gillison ML (2010) Human Papillomavirus and head and neck cancer. In: Epidemiology, pathogenesis, and prevention of head and neck cancer, pp 87–116
Clark P, Boswell R (1991) Rule induction with CN2: some recent improvements. In: Proceedings of European working session on learning-91, Portugal, pp 151–163
Das L, Sarkar T, Maiti AK, Naskar S, Das S, Chatterjee J (2014) Integrated cervical smear screening using liquid based cytology and bioimpedance analysis. J Cytol 31(4):183–188
Dietterich TG (2000) Ensemble methods in machine learning. In: Multiple classifier systems. Lecture Notes in Computer Science, vol 1857, pp 1–15
Genctav A et al (2012) Unsupervised segmentation and classification of cervical cell images. Pattern Recogn 45:4151–4168
GLOBOCAN 2002 database: summary table by cancer. Archived from the original on 16 June 2008
Hojker S, Kononenko I, Juka A, Fidler V, Porenta M (1988) Expert system’s development in management of thyroid disease. In: Proc. European congress for nuclear medicine, Milano
Horn KA, Compton P, Lazarus L, Quinlan JR (1985) An expert system for interpretation of thyroid assays in clinical laboratory. Aust Comput J 17(1):7–11
Jantzen J, Norup J, Dounias G, Bjerregaard B (2005) Pap-smear benchmark data for pattern classification. Nat Inspir Smart Inf Syst (NiSIS) 1–9
Katsis CD et al (2011) An integrated system based on physiological signals for the assessment of affective states in patients with anxiety disorders 6(3):261–268
Kent A (2010) HPV vaccination and testing. Rev Obstet Gynecol 3(1):33–34. PMC 2876324. PMID 20508781
Kern J, Dezelic G, Tezak-Bencic M, Durrigl T (1990) Medical decision making using inductive learning program. In: Proceedings of 1st congress on Yugoslav medical informatics, Beograd, Dec 6–8, pp 221–228
Lasota T et al (2009) Intelligent data engineering and automated learning—IDEAL 2009, volume 5788 of the series Lecture Notes in Computer Science, pp 554–561
Lesmo L, Saitta L, Torasso P (1983) Fuzzy production rules: a learning methodology. Adv Fuzzy Sets Possibility Theory Appl 181–198
Lezoray O, Cardot H (2002) Cooperation of color pixel classification schemes and color watershed: a study for microscopic images. IEEE Trans Image Process 11(7):783–789
Lisboa PJ, Taktak AF (2006) The use of artificial neural networks in decision support in cancer: a systematic review. Neural Netw 19:408–415
Mat-Isa NA, Mashor MY, Othman NH (2008) An automated cervical pre-cancerous diagnostic system. Artif Intell Med 42(1):1–11
Metzler V, Lehmann T, Bienert H, Mottaghy K, Spitzer K (1999) A novel method for quantifying shape deformation applied to biocompatibility testing. ASAIO J 45(4):264–271
Meyer-Arendt JR, Humphreys DM (1972) Quantitative morphology of cancer cells. Acta Histochem 44(1):41–48
Muñoz N et al (2003) Epidemiologic classification of human papillomavirus types associated with cervical cancer. N Engl J Med 348:518–527. doi:10.1056/NEJMoa021641
Musavi MT et al (1992) On the training of radial basis function classifiers. Neural Netw 5(4):595–603
Nandakumar V, Kelbauskas L, Johnson R, Meldrum D (2011) Quantitative characterization of preneoplastic progression using single-cell computed tomography and three dimensional karyometry. Cytometry Part A 79(1):25–34
Nunez M (1990) Decision tree induction using domain knowledge, current trends in knowledge acquisition. IOS Press, Amsterdam
Regnier-Coudert O et al (2012) Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers. Artif Intell Med 55(1):25–35
Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33(1):1–39
Sarwar A, Sharma V (2013) Comparative analysis of machine learning techniques in prognosis of type II diabetes. AI & Society, Springer Verlag, London
Sarwar A, Sharma V, Gupta R (2015) Hybrid ensemble learning technique for screening of cervical cancer using Papanicolaou smear image analysis. Personalized Medicine Universe 4:54–62. doi:10.1016/j.pmu.2014.10.001
Saslow D et al (2012) American Cancer Society, American Society for Colposcopy and Cervical Pathology, and American Society for Clinical Pathology Screening Guidelines for the Prevention and Early Detection of Cervical Cancer. CA Cancer J Clin 62(3):147–172. doi:10.3322/caac.21139
Shidham VB et al (2011) p16 INK4a immunocytochemistry on cell blocks as an adjunct to cervical cytology: potential reflex testing on specially prepared cell blocks from residual liquid-based cytology specimens. CytoJournal 8:1
Sokouti B et al (2012) A framework for diagnosing cervical cancer disease based on feedforward MLP neural network and ThinPrep histopathological cell image features. Neural Comput Appl. doi:10.1007/s00521-012-1220-y
World Health Organization (2006) Fact sheet no. 297: Cancer. Retrieved 01 Dec 2007


Author information

Authors and Affiliations

Department of Computer Science and IT, University of Jammu, Jammu, 180006, J&K, India
Abid Sarwar, Mehbob Ali & Vinod Sharma
Department of Pathology, Govt. Medical College, Jammu, 180001, J&K, India
Jyotsna Suri

Corresponding author

Correspondence to Abid Sarwar.

About this article

Cite this article

Sarwar, A., Suri, J., Ali, M. et al. Novel benchmark database of digitized and calibrated cervical cells for artificial intelligence based screening of cervical cancer. J Ambient Intell Human Comput 7, 593–606 (2016). https://doi.org/10.1007/s12652-016-0353-8

Received: 15 October 2015
Accepted: 11 February 2016
Published: 10 March 2016
Issue Date: August 2016
DOI: https://doi.org/10.1007/s12652-016-0353-8
