Gynecological cancer prognosis using machine learning techniques: A systematic review of the last three decades (1990–2022)

https://doi.org/10.1016/j.artmed.2023.102536

Highlights

  • This review evaluates the use of machine learning in gynecological oncology.

  • Definitions, methodologies, study quality, and clinical significance were variable.

  • The shortcomings of ML studies have been identified in this review.

  • Recommendations have been provided to address these shortcomings.

Abstract

Objective

Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs.

Methods

Electronic databases were systematically searched for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. A total of 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly.

Results

Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes.

Conclusion

Model development for prognosticating gynecological malignancies varies significantly with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, ROB and applicability analysis using PROBAST raises concern about the translatability of existing models. This review identifies ways in which these shortcomings can be addressed in future work to develop robust, clinically translatable models within this promising field.

Introduction

The National Cancer Institute defines cancer as a “disease in which some of the body's cells grow uncontrollably and spread to other parts of the body” [1]. When such uncontrolled growth occurs in women's reproductive organs or genitals, the resulting diseases are referred to as ‘gynecological cancers’. There are five main types of gynecological cancers (cervical, ovarian, uterine, vaginal and vulval), named after the organ or tissue from which they originate. The most common gynecological malignancies (ovarian, cervical and uterine cancers) present a significant disease burden worldwide [2]. These cancers are prognostically variable. Of these, ovarian cancer has the highest rate of recurrence, at 85 % [3], and the lowest rate of five-year survival, at 30 % [4]. This is worsened by its non-specific symptomatology and frequent late-stage diagnosis [5]. Although some clinical factors are prognostic, such as grade, stage, tumor subtype and debulking surgery success, the most commonly used clinical predictors [5] and biomarkers [6] are inadequate to predict clinical outcomes. Cervical cancer is one of the most common gynecological malignancies and is the fourth highest cause of cancer mortality in women worldwide [2]. Although some high Human Development Index (HDI) nations have had success in reducing the disease burden with screening and prevention programs [7], the prognostication of advanced-stage cervical cancer is variable [8]. Finally, although uterine cancer has a better prognosis than other malignancies, this disease is often very heterogeneous, making prognostication with current methods a challenge [9].

In recent years, prognostication has developed as a major focus in oncology, where decision making is influenced by the predicted probability of future events [10]. Treatment for gynecological cancers depends on the extent to which they have spread and the type of cancer, and includes modalities such as surgery, chemotherapy and radiotherapy. Developing oncological prediction algorithms and decision support tools would be useful for allowing clinicians to choose optimal screening, therapeutic and follow-up pathways for patients. However, challenges arise from the cancers' biological complexity and prognostic variability, alongside the ever-changing clinical, biological and pathological understanding of these malignancies [11].

Healthcare systems and clinicians currently use several tools to screen, diagnose and treat patients; however, current clinical approaches for many malignancies favor clinical staging and histopathological parameters, with multivariate modelling showing limited success [12]. To address these shortcomings, machine learning (ML) approaches have been used to facilitate complex prognostic modelling that may outperform traditional methods [13]. ML methods aim to develop predictive algorithms without requiring complete prior rule definition, a valuable approach in complex clinical settings [12]. Predictive systems begin with data that undergoes pre-processing and feature extraction, followed by statistical analysis of extracted features, with selected features producing a classification result (Fig. 1). ML systems can be used at each step of this process. The ML classifiers (shown in Appendix A) used to perform the classification tasks include both unsupervised learning methods, which draw correlations within a dataset without a directed outcome, and supervised learning methods, including support vector machines (SVMs) and artificial neural networks (ANNs), which are goal-directed toward a particular outcome, regression or classification [12]. ML systems are typically fitted on a ‘training’ dataset, and their performance is assessed on a separate ‘validation’ dataset, which the system is naïve to, while their parameters are still being tuned. Finally, the system is typically exposed to a ‘testing’ dataset that it is also naïve to, which allows an unbiased assessment of the final model's performance.
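The training, validation and testing roles described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the data are synthetic, the k-nearest-neighbour classifier is only a simple stand-in for the ML methods discussed, and all function names (three_way_split, knn_predict, knn_accuracy) and split fractions are hypothetical choices made for illustration.

```python
import random


def three_way_split(n_samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle sample indices and partition them into train/validation/test."""
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must leave training data")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    n_val = round(n_samples * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query (1-D feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def knn_accuracy(train_x, train_y, eval_x, eval_y, k):
    """Fraction of evaluation samples whose label is predicted correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)


def take(values, idx):
    """Select the entries of `values` at the given indices."""
    return [values[i] for i in idx]


# Synthetic, purely illustrative data: one numeric feature and a binary outcome.
rng = random.Random(42)
labels = [rng.randint(0, 1) for _ in range(300)]
features = [rng.gauss(1.5 * y, 1.0) for y in labels]

train_idx, val_idx, test_idx = three_way_split(len(labels))
train_x, train_y = take(features, train_idx), take(labels, train_idx)
val_x, val_y = take(features, val_idx), take(labels, val_idx)
test_x, test_y = take(features, test_idx), take(labels, test_idx)

# Tuning step: the hyperparameter k is chosen using the validation set only.
candidate_ks = [1, 3, 5, 7, 9]
best_k = max(candidate_ks, key=lambda k: knn_accuracy(train_x, train_y, val_x, val_y, k))

# Final step: the test set is used once, after tuning, for the unbiased estimate.
test_accuracy = knn_accuracy(train_x, train_y, test_x, test_y, best_k)
print(f"selected k={best_k}, test accuracy={test_accuracy:.2f}")
```

The point of the design is that the test indices never influence the choice of k; reusing the validation accuracy as the reported performance would tend to give an optimistic estimate.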

These methods have seen little clinical translation in gynecological malignancies. Although they have been studied in this setting, the studies vary in both approach and success. Previous systematic reviews have summarized artificial intelligence (AI) in gynecologic imaging [14] or the application of ML methods to gynecological cancers in general [15], [16]. However, to the authors' knowledge, a systematic review of the literature specific to prognostication in gynecological cancer has yet to be performed. Therefore, this study aims to systematically review ML in the prognostication of gynecological malignancies and evaluate the methodologies used.

Section snippets

Search strategy

This study was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [17]. We searched PubMed, Embase, Web of Science, ENGINE, Scopus, IEEE Xplore and ACM Digital Library for studies exploring the use of ML methods for predicting prognosis for gynecological malignancies. Our search query was developed in PubMed using MeSH and keyword terms, and then revised for the other databases (Supplementary Table 1). The final iteration of

Results

The aim of this study was to systematically review the use of ML in the prognostication of gynecological malignancies and evaluate the methodologies used. In total, the initial search yielded 2207 unique papers with 349 papers passing title and abstract screening. 139 papers met all criteria for inclusion in the study [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49],

Discussion

This review aimed to systematically review and evaluate the methodologies used when applying ML to prognosticate common gynecological malignancies. The results show some promise for ovarian, uterine and cervical cancers: ML methods demonstrated discriminative predictive ability, and superiority where individual studies compared them with non-ML methods. However, there were frequent methodological and reporting shortcomings that limit any conclusions that can be drawn in this review,

Conclusions

The literature features ML models that may improve patient outcomes in the future, with discriminative benefits over current methods; however, concerns regarding ROB and applicability exist with the currently available literature. Genomic and clinicopathological predictor variables, in combination with RF and SVM ML methods, have been the most commonly applied tools to date. However, there is significant heterogeneity in the field, and in recent times, unique ML methods have

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to gratefully acknowledge the literature search contributions made by Ms. Jacky Cribb and Ms. Kaye Cumming, and analysis contributions from Ms. Daisy Eunji Cho, Ms. Yukei Oo and Ms. Yu Jin Cha.

CRediT authorship contribution statement

Conceptualization, JS., HR., RA., RG., XT., XZ., YL., and SKC.; methodology, JS., HR., RA., RG., XT., XZ., YL., and SKC.; formal analysis, JS., HR., HWL., and SKC.; data curation, JS., HR., and SKC.; writing—original draft preparation, JS., and HR.; writing—review and editing, JS., HR., RA., HWL., RG., XT., XZ., YL., TG., and SKC.; project administration, JS., HR., and SKC.; funding acquisition, SKC. All authors have read and agreed to the published version of the manuscript.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Some of the researchers who completed this work were employed under the Australian government's Rural Health Multidisciplinary Training Program and had full independence for this project.

References (169)

  • H.M. Zolbanin et al.

    Predicting overall survivability in comorbidity of cancers: a data mining approach

    Decis Support Syst

    (2015)
  • K. Matsuo

    A pilot study in using deep learning to predict limited life expectancy in women with recurrent cervical cancer

    Am J Obstet Gynecol

    (2017)
  • A.B. Shinagare

    High-grade serous ovarian cancer: use of machine learning to predict abdominopelvic recurrence on CT on the basis of serial cancer antigen 125 levels

    J Am Coll Radiol

    (2018)
  • K. Matsuo

    Survival outcome prediction in cervical cancer: cox models vs deep-learning model

    Am J Obstet Gynecol

    (2019)
  • J. Ruan

    A novel algorithm for network-based prediction of cancer recurrence

    Genomics

    (2019)
  • Q. Wang

    Prognostic potential of alternative splicing markers in endometrial cancer

    Mol Ther Nucleic Acids

    (2019)
  • S. Wang

    Deep learning provides a new computed tomography-based prognostic biomarker for recurrence prediction in high-grade serous ovarian cancer

    Radiother Oncol

    (2019)
  • Y. Li

    A prognostic nomogram integrating novel biomarkers identified by machine learning for cervical squamous cell carcinoma

    J Transl Med

    (2020)
  • D.P. Mysona

    Clinical calculator predictive of chemotherapy benefit in stage 1A uterine papillary serous cancers

    Gynecol Oncol

    (2020)
  • A.M. Praiss

    Using machine learning to create prognostic systems for endometrial cancer

    Gynecol Oncol

    (2020)
  • H. Chai

    Integrating multi-omics data through deep learning for accurate cancer prognosis prediction

    Comput Biol Med

    (2021)
  • H.Z. Chen

    A CT-based radiomics nomogram for predicting early recurrence in patients with high-grade serous ovarian cancer

    Eur J Radiol

    (2021)
  • What is cancer?

    (2021)
  • F. Bray

    Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries

    CA Cancer J Clin

    (2018)
  • G. Corrado

    Optimizing treatment in recurrent epithelial ovarian cancer

    Expert Rev Anticancer Ther

    (2017)
  • M. Kyrgiou

    Survival benefits with diverse chemotherapy regimens for ovarian cancer: meta-analysis of multiple treatments

    J Natl Cancer Inst

    (2006)
  • P. Bottoni et al.

    The role of CA 125 as tumor marker: biochemical and clinical aspects

    Adv Exp Med Biol

    (2015)
  • F. Bray

    Incidence trends of adenocarcinoma of the cervix in 13 European countries

    Cancer Epidemiol Biomarkers Prev

    (2005)
  • A.J. Vickers

    Prediction models in cancer care

    CA Cancer J Clin

    (2011)
  • C. Chu

    Prognosticating for adult patients with advanced incurable cancer: a needed oncologist skill

    Curr Treat Options Oncol

    (2020)
  • M. Nagy et al.

    Machine learning in oncology: what should clinicians know?

    JCO Clin Cancer Inform

    (2020)
  • A. Liberati

    The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration

    PLoS Med

    (2009)
  • R.F. Wolff

    PROBAST: a tool to assess the risk of bias and applicability of prediction model studies

    Ann Intern Med

    (2019)
  • K.G.M. Moons

    PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration

    Ann Intern Med

    (2019)
  • S. Kehoe

    Artificial neural networks and survival prediction in ovarian carcinoma

    Eur J Gynaecol Oncol

    (2000)
  • P.B. Snow et al.

    Neural network analysis of the prediction of cancer recurrence following debulking laparotomy and chemotherapy in stages III and IV ovarian cancer

    Mol Urol

    (2001)
  • T. Ochi

    Survival prediction using artificial neural networks in patients with uterine cervical cancer treated by radiation therapy alone

    Int J Clin Oncol

    (2002)
  • J.H. Oh

    Proteomic biomarker identification for diagnosis of early relapse in ovarian cancer

    J Bioinform Comput Biol

    (2006)
  • T.Z. Tan et al.

    A prognosis tool based on hemostasis and genetic complementary learning

  • A. Bucinski

    Evaluation of selected prognostic factors in patients with ovarian cancer applying artificial neural network analysis

    Adv Clin Exp Med

    (2007)
  • Q.H. Tan

    Evolutionary algorithm for feature subset selection in predicting tumor outcomes using microarray data

  • K.X. Zhang et al.

    CAERUS: predicting CAncER outcomes using relationship between protein structural information, protein networks, gene expression data, and mutation data

    PLoS Comput Biol

    (2011)
  • J. Ruan
  • D. Kim

    ATHENA: identifying interactions between different levels of genomic data associated with cancer clinical outcomes using grammatical evolution neural network

    BioData Min

    (2013)
  • C. Coveney

    Exploration of ovarian cancer microarray data focusing on gene expression patterns relevant to survival using artificial neural networks

  • C.-J. Tseng

    Application of machine learning to predict the recurrence-proneness for cervical cancer

    Neural Comput Applic

    (2014)
  • A. Enshaei et al.

    Artificial intelligence systems as prognostic and predictive tools in ovarian cancer

    Ann Surg Oncol

    (2015)
  • H.R. Hassanzadeh et al.

    A semi-supervised method for predicting cancer survival using incomplete clinical data

  • M. Liang

    Integrative data analysis of multi-platform cancer data with a multimodal deep learning approach

    IEEE/ACM Trans Comput Biol Bioinformatics

    (2015)
  • V. Gligorijevic et al.

    Patient-specific data fusion for cancer stratification and personalised treatment

    Pac Symp Biocomput

    (2016)