Abstract
Cervical cell classification is of great clinical significance for early-stage cervical cancer screening. However, public cervical smear cell datasets are scarce, their class distributions are imbalanced, image quality is uneven, and CNN-based classification studies tend to overfit. To address these problems, we propose a cervical cell image generation model based on taming transformers (CCG-taming transformers) to provide high-quality cervical cancer datasets with sufficient samples and balanced classes. We improve the encoder structure by introducing SE-block and MultiRes-block to strengthen the extraction of information from cervical cancer cell images; we introduce Layer Normalization to standardize the data, which facilitates the subsequent non-linear processing by the ReLU activation function in the feed-forward module; and we introduce SMOTE-Tomek Links to balance the number of samples and class weights in the source dataset. We then use Tokens-to-Token Vision Transformers (T2T-ViT) combined with transfer learning to classify the cervical smear cell images and improve classification performance. Classification experiments with the proposed model are performed on three public cervical cancer datasets; the classification accuracies on the liquid-based cytology Pap smear dataset (4-class), SIPAKMeD (5-class), and Herlev (7-class) are 98.79%, 99.58%, and 99.88%, respectively. The images generated on these three datasets are very close to the source data: the final averaged Inception Score (IS), Fréchet Inception Distance (FID), Recall, and Precision are 3.75, 0.71, 0.32, and 0.65, respectively. Our method improves the accuracy of cervical smear cell classification, provides additional cervical cell sample images for cervical cancer research, and assists gynecologists in judging and diagnosing different types of cervical cancer cells and in analyzing cervical cancer cells at different stages, which are otherwise difficult to distinguish. This paper applies transformers to the generation and recognition of cervical cancer cell images for the first time.
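The sketches below are illustrative only and are not the authors' implementation. The first, assuming the imbalanced-learn library and a hypothetical flattened feature matrix, shows how SMOTE-Tomek Links can rebalance an imbalanced multi-class dataset before training.

```python
# Minimal sketch of class balancing with SMOTE-Tomek Links (imbalanced-learn);
# the data here are hypothetical flattened feature vectors, not cervical cell images.
import numpy as np
from imblearn.combine import SMOTETomek

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))                      # 200 samples, 64 features each
y = np.repeat([0, 1, 2, 3], [120, 40, 25, 15])      # imbalanced 4-class labels

# SMOTE oversamples the minority classes, then Tomek links removes ambiguous
# majority/minority pairs near the class boundary.
X_res, y_res = SMOTETomek(random_state=42).fit_resample(X, y)
print(np.bincount(y), "->", np.bincount(y_res))
```

The second is a minimal PyTorch sketch of a squeeze-and-excitation (SE) block of the kind introduced into the encoder; the channel count and reduction ratio are illustrative assumptions rather than the paper's configuration.

```python
# Minimal SE-block sketch: squeeze (global average pooling) followed by
# excitation (a small bottleneck MLP) that reweights the channels of a feature map.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                # channel-wise recalibration

feats = torch.randn(2, 64, 32, 32)                  # dummy batch of feature maps
print(SEBlock(64)(feats).shape)                     # torch.Size([2, 64, 32, 32])
```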
Acknowledgments
Chen Zhao contributed to the writing and editing of the paper and to the implementation and editing of the code. Renjun Shuai (corresponding author) provided technological guidance, experimental equipment, and the major financial support. Li Ma contributed technical support and guidance on the paper's concept. Wenjia Liu provided technical guidance and served as a medical consultant, and Menglin Wu contributed to the direction of the paper, funding support, and related work.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61701222.
Supplementary Information
ESM 1 (DOCX 53 kb)
Cite this article
Zhao, C., Shuai, R., Ma, L. et al. Improving cervical cancer classification with imbalanced datasets combining taming transformers with T2T-ViT. Multimed Tools Appl 81, 24265–24300 (2022). https://doi.org/10.1007/s11042-022-12670-0