Abstract
Cervical precancer is a direct precursor to invasive cervical cancer and a prime target for ablative therapy. This paper presents an empirical study of Vision Transformers (ViT) for cervical precancer classification. It extends our previous work and uses data derived from two studies conducted by the U.S. National Cancer Institute. We show that ViT can significantly outperform current state-of-the-art methods. We also examine data augmentation techniques that help reduce noise, such as specular reflection, that can interfere with precancer detection. We achieve 84% accuracy on the test set, outperforming existing work based on the same dataset. Beyond the performance gains, we observe that the learned features focus on cervical regions of anatomical significance. Together, these experiments demonstrate that ViT compares favorably with current state-of-the-art methods in classifying cervical images for cervical precancer screening.
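As background on the model family named above, a Vision Transformer treats an image as a sequence of fixed-size patches, each flattened into a token before being fed to a transformer encoder. The sketch below illustrates only that tokenization step. It is a minimal illustration, not the authors' pipeline: the abstract does not state the patch size, input resolution, or preprocessing, so the grayscale nested-list image and the hypothetical helper `split_into_patches` are assumptions made for clarity.

```python
from typing import List

# Assumption: a grayscale image stored as rows of pixel values.
Image = List[List[float]]


def split_into_patches(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch x patch tiles.

    Tiles are visited left to right, top to bottom, and each tile is
    flattened row by row into one token (a flat list of pixel values).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens: List[List[float]] = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append(
                [
                    image[row][col]
                    for row in range(top, top + patch)
                    for col in range(left, left + patch)
                ]
            )
    return tokens


# Toy 4 x 4 image whose pixel value encodes its position (row * 4 + col).
toy_image = [[float(row * 4 + col) for col in range(4)] for row in range(4)]
toy_tokens = split_into_patches(toy_image, patch=2)
```

With a patch size of 2, the toy 4 x 4 image yields four tokens of four values each. In a full ViT, each token would then be linearly projected and combined with a position embedding; those steps are omitted here.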
Acknowledgment
This work was supported by the Intramural Research Program of the National Library of Medicine, part of the National Institutes of Health. The data used in this research were provided under an agreement between the National Library of Medicine and the National Cancer Institute (NCI). We are grateful to Dr. Mark Schiffman and his team at the NCI for feedback on our findings.
Copyright information
© 2022 This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply
About this paper
Cite this paper
Angara, S., Guo, P., Xue, Z., Antani, S. (2022). An Empirical Study of Vision Transformers for Cervical Precancer Detection. In: Santosh, K., Hegadi, R., Pal, U. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2021. Communications in Computer and Information Science, vol 1576. Springer, Cham. https://doi.org/10.1007/978-3-031-07005-1_3
DOI: https://doi.org/10.1007/978-3-031-07005-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07004-4
Online ISBN: 978-3-031-07005-1
eBook Packages: Computer Science (R0)