Abstract
Conventional manual classification of cervical cells is inefficient and must be carried out by trained professionals, so the classification process increasingly depends on artificial intelligence. Traditional image classification methods need to extract a large number of features, and redundant features slow down recognition and degrade its accuracy. To address these problems and obtain the highest recognition accuracy with the fewest features, this paper proposes a machine learning method for cervical cell classification based on a feature selection algorithm. First, classification and regression trees (CART) are introduced for cell feature selection, which reduces the dimension of the input feature attributes. Subsequently, particle swarm optimization (PSO) is used to optimize the hyperparameters of a support vector machine (SVM), improving the SVM model's classification performance. Finally, the Herlev dataset is used to verify the classification performance. The experimental results show that the proposed algorithm can extract accurate and effective features and obtain high classification accuracy, which verifies its effectiveness. Moreover, the network structure of the proposed algorithm is relatively simple and its computational cost is low, which makes it feasible to extend the method to the classification of other cancer cells.
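The two algorithmic ingredients named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The first part ranks features by the largest Gini impurity decrease achievable with a single threshold split, which is the split criterion commonly used by CART. The second part is a basic global-best PSO that minimizes an arbitrary objective. All function names, the inertia and acceleration values (w, c1, c2), and the particle and iteration counts are assumptions chosen for illustration. In a real pipeline the objective passed to the optimizer would be something like the cross-validated SVM error as a function of its hyperparameters.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_gini_gain(values, labels):
    """Largest impurity decrease over all threshold splits of one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try every distinct value except the largest as a "<= t" threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(X, y):
    """Feature indices sorted by single-split Gini gain, best first."""
    gains = [best_gini_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)


def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 w=0.7, c1=1.5, c2=1.5):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X_toy = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0], [5.0, 7.0], [6.0, 1.0]]
y_toy = [0, 0, 0, 1, 1, 1]
ranking = rank_features(X_toy, y_toy)


# Placeholder objective standing in for SVM validation error over
# (log10 C, log10 gamma); its minimum at (1, -2) is purely illustrative.
def surrogate_error(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_params, best_error = pso_minimize(surrogate_error, [(-3.0, 3.0), (-3.0, 3.0)])
print(ranking, best_params, best_error)
```

Searching in log-scaled hyperparameter space is a common design choice for SVMs, because useful values of the penalty and kernel parameters typically span several orders of magnitude. Keeping only the top-ranked features before tuning mirrors the order of the steps described in the abstract.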
Acknowledgements
This study was funded by the National Natural Science Foundation of China under Grant 61773282. The authors would like to thank the associate editor and reviewers for their valuable comments and suggestions, which improved the paper’s quality. Gratitude is also extended to the Big Data Intelligence Centre of The Hang Seng University of Hong Kong for supporting the research.
Funding
This study was funded by the National Natural Science Foundation of China (Grant Number 61773282).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Research involving human participants and/or animals
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
This article does not contain patient data.
About this article
Cite this article
Dong, N., Zhai, Md., Zhao, L. et al. Cervical cell classification based on the CART feature selection algorithm. J Ambient Intell Human Comput 12, 1837–1849 (2021). https://doi.org/10.1007/s12652-020-02256-9
DOI: https://doi.org/10.1007/s12652-020-02256-9