Integrating HSICBFO and FWSMOTE algorithm-prediction through risk factors in cervical cancer

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

The main objective in cervical carcinoma (CC) prediction is optimal feature selection on balanced data. This work addresses the imbalance between majority and minority class samples and validates risk factors for cervical cancer prediction. The Feature Weighted Synthetic Minority Oversampling Technique (FWSMOTE) algorithm handles the minority class problem. Missing values are filled in using mode and median imputation. For optimal feature selection, the Hilbert–Schmidt Independence Criterion with Bacteria Forage Optimization (HSICBFO) algorithm is implemented. An Ensemble Support Vector Machine with Interpolation classifier is used for cancer prediction. Several measures are used to evaluate the proposed classifier, which achieves 94.77% precision, 93.38% recall, 93.86% specificity, 94.07% F-measure, 93.60% accuracy and 93.62% G-mean. These results help identify the risk level of cervical carcinoma development and guide further diagnosis.
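
The reported F-measure and G-mean can be cross-checked against the reported precision, recall and specificity. The sketch below is only a consistency check under assumed definitions: F-measure as the harmonic mean of precision and recall, and G-mean as the geometric mean of recall and specificity. The preview does not state which definitions the authors used, and the function names here are illustrative.

```python
import math

# Values reported in the abstract, in percent.
precision, recall, specificity = 94.77, 93.38, 93.86


def f_measure(p, r):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * p * r / (p + r)


def g_mean(sensitivity, spec):
    """Geometric mean of recall (sensitivity) and specificity (assumed definition)."""
    return math.sqrt(sensitivity * spec)


f1 = round(f_measure(precision, recall), 2)
gm = round(g_mean(recall, specificity), 2)
print(f1, gm)  # prints: 94.07 93.62
```

Under these assumed definitions, both derived values agree with the figures quoted in the abstract to two decimal places.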

Author information

Corresponding author

Correspondence to S. Geeitha.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Geeitha, S., Thangamani, M. Integrating HSICBFO and FWSMOTE algorithm-prediction through risk factors in cervical cancer. J Ambient Intell Human Comput 12, 3213–3225 (2021). https://doi.org/10.1007/s12652-020-02194-6

  • DOI: https://doi.org/10.1007/s12652-020-02194-6
