
Medical Image Analysis

Volume 65, October 2020, 101795

Multi-task multi-modal learning for joint diagnosis and prognosis of human cancers

https://doi.org/10.1016/j.media.2020.101795

Highlights

  • Consider the inherent correlation between diagnosis and prognosis tasks and propose a novel multi-task multi-modal learning framework for joint diagnosis and prognosis of human cancer.

  • Integrate histopathological image and genomic data for the diagnosis and prognosis of human cancers.

  • Conduct experiments on three cancer cohorts from the TCGA database that can validate the effectiveness of the proposed method.

  • In-depth explanation of the selected multi-modal biomarkers.

Abstract

With the tremendous development of artificial intelligence, many machine learning algorithms have been applied to the diagnosis of human cancers. Recently, rather than predicting categorical variables (e.g., stages and subtypes) as in cancer diagnosis, several prognosis prediction models based on patients’ survival information have been adopted to estimate the clinical outcome of cancer patients. However, most existing studies treat the diagnosis and prognosis tasks separately. In fact, the diagnosis information (e.g., TNM stage) indicates the extent of disease severity, which is highly correlated with patients’ survival. While the diagnosis is largely made based on histopathological images, recent studies have also demonstrated that integrative analysis of histopathological images and genomic data holds great promise for improving the diagnosis and prognosis of cancers. However, directly combining these two types of data may introduce redundant features that negatively affect the prediction performance. Therefore, it is necessary to select informative features from the derived multi-modal data. Based on the above considerations, we propose a multi-task multi-modal feature selection method for the joint diagnosis and prognosis of cancers. Specifically, we make use of the task relationship learning framework to automatically discover the relationship between the diagnosis and prognosis tasks, through which we can identify important image and genomic features for both tasks. In addition, we add a regularization term to ensure that the correlation within the multi-modal data is captured. We evaluate our method on three cancer datasets from The Cancer Genome Atlas project, and the experimental results verify that our method achieves better performance on both diagnosis and prognosis tasks than related methods.

Graphical abstract

The framework of our study consists of three steps. First, we extract imaging and eigengene features from the histopathological image and gene expression data, respectively. Second, we apply the proposed multi-task multi-modal feature selection algorithm (i.e., M2DP) to identify diagnosis- and prognosis-related features. Third, based on the selected features of each patient, we apply AdaBoost and Cox proportional hazards models for the diagnosis and prognosis prediction of cancer patients, respectively.


Introduction

Cancer is the leading cause of death in economically developed countries and the second leading cause of death in developing countries (Siegel et al., 2016). It is estimated that there will be 539.2 new cases of cancer per 10,000 people by 2025 (Siegel et al., 2016). Thus, accurate diagnosis of cancer, especially at an early stage, is particularly important. So far, many biomarkers have been shown to be sensitive for the diagnosis of cancers. For example, quite a number of cancer diagnosis models (Nir et al., 2018, Coudray et al., 2018, Gecer et al., 2018) were based on histopathological images, since these images can reveal the morphological characteristics of cells that are closely related to the aggressiveness of cancers. Besides histopathological images, it is known that genetic mutations and gene expression levels can affect the development of cancers by accelerating cell division rates (Kim and Kaelin, 2004) and modifying the tumor micro-environment (Yuan et al., 2012). Accordingly, many researchers have also used genomic features, such as gene expression signatures, to drive diagnosis (Wilhelm et al., 2002, Yang et al., 2018, Niazi and Khalid, 2016). In all these diagnosis methods, classification models are learned from training samples to predict categorical variables (e.g., TNM stage) for the testing subjects.

In addition to predicting categorical variables (i.e., TNM stage) as in cancer diagnosis, many prognosis prediction models have been adopted to perform survival analysis based on different modalities of biomarkers (Lin et al., 1993, Cheng et al., 2017, Cooperberg et al., 2015, Li et al., 2016, Zhu et al., 2016, Yi et al., 2018, Veer et al., 2002). Different from the diagnosis task, which focuses on identifying the current disease state, the prognosis task aims at predicting the expected clinical outcome of cancer patients. Among all the prognosis prediction models, the Cox proportional hazards model (Lin et al., 1993) is the most popular. Cheng et al. (2017a) and Cooperberg et al. (2015) used the Cox model to stratify cancer patients into subgroups with different predicted outcomes from histopathological images and genomic data, respectively. Besides the Cox model, two recent studies, MTLSA (Li et al., 2016) and DeepSurv (Zhu et al., 2016), were designed to model the complex relationship between the input features and clinical outcomes. Other studies, such as Yi et al. (2018), proposed a hierarchical regression model to estimate the survival risks of different patients, and experimental results on high-dimensional genomic data validated its superiority over competing methods.
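Prognosis models such as these are commonly evaluated with the concordance index (C-index), which measures how well the predicted risk scores order patients' survival times under right censoring. A minimal sketch in plain Python (the variable names are illustrative, not taken from the paper):

```python
def concordance_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when the patient with the shorter
    observed time had an event (events[i] == 1); the pair is concordant
    when that patient was also assigned the higher predicted risk.
    Ties in risk count as half-concordant.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # i must have the earlier time and an observed event
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort: a perfect risk ordering yields a C-index of 1.0
times  = [2.0, 5.0, 7.0, 9.0]
events = [1, 1, 0, 1]          # 0 = censored
risks  = [0.9, 0.6, 0.4, 0.1]  # higher risk -> shorter survival
print(concordance_index(times, events, risks))  # -> 1.0
```

A C-index of 0.5 corresponds to random ordering, and 1.0 to a perfect one, which is why it is the standard metric reported for the prognosis task later in the paper.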

Despite this progress, to the best of our knowledge, all existing studies have treated the diagnosis and prognosis tasks independently, without considering the inherent correlation between them. As a matter of fact, the diagnosis information indicates the extent of disease severity, which is highly correlated with patients’ clinical outcomes (Scarpa et al., 2010). For example, patients in stage II suffer from more aggressive cancers than those in stage I, and thus generally have a higher risk of short survival time. It can be expected that better prediction performance will be achieved if we learn the diagnosis and prognosis tasks jointly, since the information from one task can help predict the other.

At the same time, existing studies (Sun and Li, 2018, Yao and Huang, 2017, Shao et al., 2018, Cheng et al., 2017, Yuan et al., 2012, Huang et al., 2019) have demonstrated that the integrative analysis of images and genomic data holds great promise for cancer assessment and risk prediction. For example, Yuan et al. (2012) demonstrated that integrating lymphocyte morphology from histopathological images with gene expression signatures can significantly increase the prognosis accuracy for ER-negative breast cancer patients. Sun and Li (2018) combined pathological images with gene expression data to classify longer-term and shorter-term survivors in a breast cancer cohort. Yao and Huang (2017) developed a novel deep learning framework integrating both image and genomic data to predict the clinical outcome of cancer patients. However, direct combination of multi-modal data increases the feature dimension, which may cause the "curse of dimensionality" problem, given the limited training samples in cancer research. Thus, feature selection, which can be considered as bio-marker identification, has become an important step in the diagnosis and prognosis of cancers. Currently, most existing studies (Cheng et al., 2017, Yuan et al., 2012) first concatenated all features from histopathological images and genomic data into a long feature vector, and then applied a traditional single-modality sparse learning algorithm (e.g., LASSO) to discover the key components. However, these feature selection methods overlook the correlation within the multi-modal data, which has been widely accepted as a critical component in state-of-the-art multi-modality machine learning methods (Liu et al., 2014, Mohammadi et al., 2017).
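To make the concatenate-then-LASSO baseline concrete, the single-modality sparse selection step used by those earlier studies can be sketched with a simple iterative soft-thresholding (ISTA) solver. The data here are synthetic and the parameter choices illustrative, not the settings of any cited study:

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 norm: shrink toward zero by t
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """Minimize 0.5*||Xw - y||^2 + lam*||w||_1 by ISTA."""
    n, d = X.shape
    w = np.zeros(d)
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Synthetic "concatenated" feature vector: only 3 of 20 features matter
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.01 * rng.standard_normal(100)

w_hat = lasso_ista(X, y, lam=0.5)
print(np.nonzero(np.abs(w_hat) > 0.1)[0])  # indices of the selected features
```

The l1 penalty zeroes out uninformative coefficients, but as the paragraph above notes, applying it to a flat concatenation ignores which modality each feature came from.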

Inspired by the above considerations, we propose a multi-task multi-modal feature selection method (M2DP) for the joint diagnosis and prognosis of cancers. Specifically, based on the task relationship learning framework, our method can automatically derive the correlation between the diagnosis and prognosis tasks, without assuming it to be known in advance. Intuitively, exploiting such task relationships can help identify a subset of bio-markers for a specific task with the knowledge of the other related task. In addition, we also consider the association between different modalities by adding a regularization term to capture the inter-correlation between the selected imaging and genomic components.
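The full M2DP objective is not reproduced in this excerpt. As a hedged illustration only, a standard building block for this style of multi-task feature selection is the l2,1-norm penalty, whose proximal operator shrinks entire feature rows of the task-coefficient matrix, so that a feature is selected or discarded jointly across the diagnosis and prognosis tasks:

```python
import numpy as np

def prox_l21(W, t):
    """Row-wise proximal operator of t * ||W||_{2,1}.

    Each row of W holds one feature's coefficients across tasks
    (e.g., column 0 = diagnosis, column 1 = prognosis). Rows whose
    l2 norm falls below t are zeroed, dropping that feature from
    both tasks at once; surviving rows are shrunk proportionally.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return W * scale

W = np.array([[3.0, 4.0],   # strong feature: row norm 5 -> kept, scaled by 0.8
              [0.3, 0.4],   # weak feature:   row norm 0.5 -> entire row zeroed
              [0.0, 2.0]])  # row norm 2 -> kept, scaled by 0.5
print(prox_l21(W, 1.0))
```

This is a generic sketch of joint row-sparse selection, not the authors' exact regularizer; M2DP additionally learns the task correlation and couples the imaging and genomic modalities.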

To evaluate the effectiveness of the proposed method, we perform experiments on three large cancer cohorts (i.e., Lung Squamous Cell Carcinoma, Breast Invasive Carcinoma and Liver Hepatocellular Carcinoma) from The Cancer Genome Atlas (TCGA). The experimental results verify that our proposed M2DP method can not only achieve better performance on both diagnosis and prognosis tasks than competing algorithms, but also help to discover useful histopathological image and genomic bio-markers for predicting the development of cancers.

Our preliminary work, which showed that using the diagnosis information alone can help achieve better prognosis prediction performance, was published at MICCAI 2019 (Shao et al., 2019). In this substantially expanded journal paper, we offer new contributions in the following aspects: 1) further demonstrating that the proposed model can also improve the prediction performance for the diagnosis task; 2) evaluating the effectiveness of the proposed method on two additional datasets (i.e., the Breast Invasive Carcinoma and Liver Hepatocellular Carcinoma datasets); 3) providing an in-depth explanation of the bio-markers identified by the proposed model; 4) visualizing the selected image features in both high and low survival risk groups; and 5) discussing the effect of the parameter σ in the proposed M2DP model.

Section snippets

Datasets.

The Cancer Genome Atlas (TCGA) is a large consortium project that has generated genomic and imaging data for thousands of tumor samples across more than 30 types of cancers (Zhu et al., 2014). In this study, we test our method on three early-stage (i.e., stage I and stage II) cancer cohorts including Lung Squamous Cell Carcinoma (LUSC), Breast Invasive Carcinoma (BRCA) and Liver Hepatocellular Carcinoma (LIHC) from TCGA, since the diagnosis and prognosis of early-stage cancer patients are

Experimental settings.

To evaluate the performance of the proposed M2DP method for the diagnosis and prognosis of cancer patients, we test it on three early-stage cancer cohorts (i.e., LUSC, BRCA and LIHC) derived from the TCGA database. For each cohort, we randomly partition it into 5 folds. Here, we enforce that the ratios of Stage I and censored patients in each fold approximate those in the whole cohort with a gap of at most ±0.05, and we show the ratios of Stage I and censored patients in each fold in Tables S4-S6 in
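The stratified partition described above can be approximated by dealing each stratum (e.g., each (stage, censoring) combination) round-robin across the folds, which keeps every fold's label ratios close to the cohort's. This is an illustrative sketch under that assumption, not the authors' exact procedure:

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so that each label's
    proportion per fold stays close to its cohort-wide proportion.

    `labels` can be any hashable stratum, e.g. a (stage, censored) pair.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_label[lab].append(idx)
    folds = [[] for _ in range(k)]
    for lab, members in by_label.items():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)   # round-robin deal per stratum
    return folds

# Toy cohort: 60 Stage I patients (indices 0-59), 40 Stage II (60-99)
labels = ["I"] * 60 + ["II"] * 40
folds = stratified_folds(labels)
for f in folds:
    ratio = sum(labels[i] == "I" for i in f) / len(f)
    print(round(ratio, 2))   # -> 0.6 for every fold (cohort ratio is 0.6)
```

Because each stratum's size here divides evenly by k, every fold matches the cohort ratio exactly; with uneven strata the deviation stays within one sample per stratum, comfortably inside the ±0.05 gap the paper enforces.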

The effect of the parameter σ in M2DP model.

In the proposed M2DP model, we fix the parameter σ in the weight of the survival loss (i.e., shown in Eq. (2)) as a constant (σ=1.5) that is larger than 1. In this section, we investigate the effect of tuning σ in the M2DP model. Specifically, we vary the parameter σ over {0.5, 1.5, 2, 2.5}, and record the corresponding accuracies and concordance indexes for the diagnosis and prognosis tasks, respectively. The results are shown in Table 8. As can be seen from Table 8, on one hand, the

Conclusion

In this paper, we develop M2DP, an effective multi-task multi-modal feature selection method that can jointly identify diagnosis and prognosis associated bio-markers from both histopathological image and gene expression data. The main advantage of our approach is its capability of utilizing the inherent correlation within different tasks to guide the feature selection process, which can more accurately diagnose cancer stage and predict the clinical outcome for different types of cancer

CRediT authorship contribution statement

Wei Shao: Conceptualization, Methodology, Writing - original draft. Tongxin Wang: Methodology. Liang Sun: Methodology. Tianhan Dong: Validation. Zhi Han: Writing - original draft. Zhi Huang: Conceptualization. Jie Zhang: Writing - review & editing. Daoqiang Zhang: Supervision. Kun Huang: Supervision.

Declaration of Competing Interest

Dear Editor-in-Chief:

We would like to submit the enclosed manuscript "Multi-task Multi-modal Learning for Joint Diagnosis and Prognosis of Human Cancers", which we were invited to submit to Medical Image Analysis as an extension of our MICCAI 2019 paper (i.e., Diagnosis-Guided Multi-modal Feature Selection for Prognosis Prediction of Lung Squamous Cell Carcinoma).

All authors have approved this submission and confirm there are no conflicts of interest with the requested reviewers.

Best regards,

Wei

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61902183, 61876082, 61861130366, 61732006) and National Key R&D Programme of China (Grant Nos. 2018YFC2001600, 2018YFC2001602), the Royal Society-Academy of Medical Sciences Newton Advanced Fellowship (No. NAF\R1\180371), and the IU Precision Health Initiative Program.

References (49)

  • T. Ashton et al., Oxidative phosphorylation as an emerging target in cancer therapy, Clin. Cancer Res. (2018)

  • H.C. Chen et al., Assessment of performance of survival prediction models for cancer prognosis, BMC Med. Res. Methodol. (2012)

  • J. Cheng et al., Identification of topological features in renal tumor microenvironment associated with patient survival, Bioinformatics (2017)

  • J. Cheng et al., Integrative analysis of histopathological images and genomic data predicts clear cell renal cell carcinoma prognosis, Cancer Res. (2017)

  • N. Coudray et al., Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning, Nat. Med. (2018)

  • C. Denkert et al., Tumor-associated lymphocytes as an independent predictor of response to neoadjuvant chemotherapy in breast cancer, J. Clin. Oncol. (2010)

  • Z. Huang et al., SALMON: Survival analysis learning with multi-omics neural networks on breast cancer, Front. Genet. (2019)

  • Y. Kim et al., Role of VHL gene mutation in human cancer, J. Clin. Oncol. (2004)

  • M. Lerman et al., The 630-kb lung cancer homozygous deletion region on human chromosome 3p21.3: identification and evaluation of the resident candidate tumor suppressor genes, Cancer Res. (2000)

  • Y. Li et al., A multi-task learning formulation for survival analysis, Proceedings of the SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)

  • D. Lin et al., Checking the Cox model with cumulative sums of martingale-based residuals, Biometrika (1993)

  • D.Y. Lin et al., The robust inference for the Cox proportional hazards model, J. Am. Stat. Assoc. (1989)

  • M. Liu et al., Joint binary classifier learning for ECOC-based multi-class classification, IEEE Trans. Pattern Anal. Mach. Intell. (2015)

  • Y. Liu et al., Cancer and innate immune system interactions: translational potentials for cancer immunotherapy, J. Immunother. (1997)
1 Wei Shao is now working in the School of Medicine, Indiana University, USA.
