Computational pan-cancer characterization of model-based quantitative transcription regulations dysregulated in regional lymph node metastasis
Introduction
Cancer is the second leading cause of death worldwide, and mortality is particularly high among patients diagnosed at late stages [1]. Projections for the United States in 2021 are 1,898,160 new cases and 608,570 cancer-related deaths [2]. The breast cancer incidence rate increased by 0.3% per year during 2012–2016 [3]. Although overall death rates continue to decline due to new technology in the diagnosis and treatment of cancer, incidence rates are leveling off among males and increasing slightly among females [4]. The current estimated 5-year relative survival rates for all cancer types combined are 68.5% for men and 70.1% for women [5].
Lymphatic metastasis is an important mechanism for the spread of human cancers [6]. Tumor lymphatics play an essential role in cancer progression and are solely responsible for transporting malignant cells to regional lymph nodes, preceding the systemic lethal spread [7]. For example, breast cancer metastasizes through the lymphovascular system to the regional lymph nodes in the axilla and to both visceral and non-visceral sites [8]. A local lymph node-positive or metastatic diagnosis was observed in most men who eventually died from prostate cancer [9].
Regional lymph node metastasis is thus an important prognostic factor in many cancer types [10,11]. It has been used as a risk indicator for gastric cancer patients [12] and negatively affects the prognosis of prostate cancer [13]. Regional lymph node metastasis is also considered during the treatment decision process [14], for example when deciding whether to remove the nodes as part of the primary cancer resection to prevent future risks, or to treat distant metastasis in vital organs [15]. Melanoma patients with regional lymph node metastasis usually require systemic and intensive adjuvant therapies [16]. Further, lymph node metastasis is directly associated with treatment response, local recurrence, and long-term survival of gastric cancer patients [17]. Therefore, the investigation of lymph node metastasis across various cancer types has important clinical significance.
This study formulated the quantitative transcription regulation relationship between an mRNA and multiple transcription factors (TFs) as a regression (mqTrans) model. The mqTrans model of an mRNA was trained on the primary cancer samples and evaluated for its fluctuations in the regional lymph node metastatic samples. This study established the quantitative landscape of transcription regulation across 18 cancer types, and then screened the changes in regional lymph node metastasis. Only a few metastasis-dysregulated mqTrans models were shared across cancer types, suggesting inherent metastatic heterogeneity among different cancer types. Support from the literature was found for only three of the top-10 ranked metastasis-dysregulated genes in the 18 cancer types.
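The idea of training an mqTrans model on primary samples and then checking how it deviates in metastatic samples can be illustrated with a minimal sketch. The text does not specify the regression algorithm, so ordinary least squares is used here purely as an assumption. The function names (fit_mqtrans, mqtrans_residual) and the toy expression values are hypothetical and not taken from the study.

```python
import numpy as np

def fit_mqtrans(tf_primary, mrna_primary):
    """Fit mRNA ~ TFs on primary samples (ordinary least squares, an assumption)."""
    # Append a column of ones so the model includes an intercept.
    design = np.column_stack([tf_primary, np.ones(len(tf_primary))])
    coef, *_ = np.linalg.lstsq(design, mrna_primary, rcond=None)
    return coef  # TF weights followed by the intercept

def mqtrans_residual(coef, tf_samples, mrna_samples):
    """Observed minus model-predicted mRNA level for each sample."""
    design = np.column_stack([tf_samples, np.ones(len(tf_samples))])
    return mrna_samples - design @ coef

# Hypothetical toy data: 5 primary samples, 2 TFs, mRNA = 2*TF1 + 1*TF2 + 0.5.
tf_primary = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
mrna_primary = tf_primary @ np.array([2.0, 1.0]) + 0.5
coef = fit_mqtrans(tf_primary, mrna_primary)

# Hypothetical metastatic samples whose mRNA sits 1.5 units above the TF-based prediction.
tf_meta = np.array([[1.0, 2.0], [3.0, 0.0]])
mrna_meta = tf_meta @ np.array([2.0, 1.0]) + 0.5 + 1.5
residual_meta = mqtrans_residual(coef, tf_meta, mrna_meta)
```

In this sketch, a residual that departs consistently from zero in the metastatic samples is what would mark the gene's transcription regulation as shifted relative to the primary-sample model.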
Datasets and preprocessing
The Cancer Genome Atlas (TCGA) is one of the most comprehensive genomic datasets of multiple cancer types [18]. The RNAseq-based transcriptomic datasets of 18 TCGA cancer types were used in this study. The datasets and preprocessing steps are described in the supplementary section “Dataset summary and preprocessing” and Supplementary Table S1.
Feature selection for the regression models
Feature selection algorithms may substantially reduce the number of features used to train a classification or regression model [19,20]. To
Optimizing the mqTrans models of three cancer types
We selected three datasets (COAD, LUAD, and LUSC) to optimize the parameter pThreshold of the mqTrans models, as shown in Fig. 1. The parameter pThreshold was tuned from 0.0 to 1.0 with a step size of 0.1, and only the TF features whose weights were no smaller than pThreshold times the maximal TF weight in the same regression model were kept for further analysis.
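The pThreshold rule can be sketched in a few lines. This is an illustrative reading of the description above, not the authors' code. It assumes that weights are compared by absolute value, which the text does not state, and the function name filter_tf_features and the TF names and weights are hypothetical.

```python
def filter_tf_features(tf_weights, p_threshold):
    """Keep TFs whose weight is at least p_threshold times the maximal TF weight.

    Assumption: weights are compared by magnitude (absolute value).
    """
    if not tf_weights:
        return {}
    max_weight = max(abs(w) for w in tf_weights.values())
    cutoff = p_threshold * max_weight
    return {tf: w for tf, w in tf_weights.items() if abs(w) >= cutoff}

# Hypothetical TF weights from one regression model.
tf_weights = {"TF_A": 0.80, "TF_B": -0.45, "TF_C": 0.10, "TF_D": 0.30}

# Tune pThreshold from 0.0 to 1.0 with a step size of 0.1.
grid = [round(0.1 * i, 1) for i in range(11)]
kept_per_threshold = {p: sorted(filter_tf_features(tf_weights, p)) for p in grid}
```

With these toy weights, raising pThreshold progressively drops the weakest TFs until only the TF with the maximal weight remains at pThreshold = 1.0.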
Fig. 1 (a) shows that the mqTrans models fluctuate radically when pThreshold <0.3. The dataset COAD increases MeanMAE from 5.46 at pThreshold = 0.1
Conclusion
This study quantitatively investigated the transcription regulations using regression models, and comprehensively screened the genes whose model-based quantitative transcription regulations (mqTrans) were significantly altered in the lymph node metastasis of 18 cancer types. These cancer types only shared a limited number of metastasis-dysregulated mqTrans models, even between subtypes. This suggests an inherent heterogeneity of cancers.
The mqTrans technology also provided a new perspective to
Declaration of competing interest
The authors declared no competing interests.
Acknowledgements
This work was supported by the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC), the Education Department of Jilin Province (JJKH20180145KJ), and a startup grant from Jilin University. This work was also partially supported by the Bioknow MedAI Institute (BMCPP-2018-001), the High-Performance Computing Center of Jilin University, and the Fundamental Research Funds for the Central Universities, JLU.
Insightful comments from the editor-in-chief and the two
References (50)
- et al. Diagnostic Characteristics of Lethal Prostate Cancer (2017)
- et al. Prognostic factors in primary anorectal melanoma: a clinicopathological study of 60 cases in China. Hum. Pathol. (2018)
- et al. FeSTwo, a two-step feature selection algorithm based on feature engineering and sampling for the chronological age regression problem. Comput. Biol. Med. (2020)
- et al. Selection of features for patient-independent detection of seizure events using scalp EEG signals. Comput. Biol. Med. (2020)
- et al. Estimating a person's age from walking over a sensor floor. Comput. Biol. Med. (2018)
- et al. A deep learning approach for sepsis monitoring via severity score estimation. Comput. Methods Progr. Biomed. (2021)
- et al. Identifying new associated pleiotropic SNPs with lipids by simultaneous test of multiple longitudinal traits: an Iranian family-based study. Gene (2019)
- et al. The somatic genomic landscape of chromophobe renal cell carcinoma. Canc. Cell (2014)
- et al. Expression of the SART3 tumor rejection antigen in renal cell carcinoma. J. Urol. (2000)
- et al. Modeled Reductions in Late-Stage Cancer with a Multi-Cancer Early Detection Test. Cancer Epidemiology, Biomarkers & Prevention: a Publication of the American Association for Cancer Research, Cosponsored by the American Society of Preventive Oncology (2020)
- Cancer statistics, 2021. CA: Canc. J. Clin.
- Breast cancer statistics. CA: Canc. J. Clin.
- Annual report to the nation on the status of cancer, part I: national cancer statistics. Cancer
- Trends in cancer incidence and mortality rates in the United States from 1975 to 2016. Ann. Transl. Med.
- Lymphatic metastasis. Canc. Metastasis Rev.
- Lymphatic endothelial cell progenitors in the tumor microenvironment. Adv. Exp. Med. Biol.
- Breast cancer metastasis through the lympho-vascular system. Clin. Exp. Metastasis
- Treatment of the adenocarcinoma of the esophagogastric junction at a single institution in Mexico. Ann. Surg. Oncol.
- Histopathological predictor for regional lymph node metastasis in gastric cancer. Virchows Arch.: Int. J. Pathol.
- Development and validation of hub genes for lymph node metastasis in patients with prostate cancer. J. Cell Mol. Med.
- A new model for lymphatic metastasis: development of a variant of the MDA-MB-468 human breast cancer cell line that aggressively metastasizes to lymph nodes. Clin. Exp. Metastasis
- Regional lymph node metastases; a singular manifestation of the process of clinical metastases in cancer: contemporary animal research and clinical reports suggest unifying concepts. Ann. Surg. Oncol.
- Long-term survival in 2,505 patients with melanoma with regional lymph node metastasis. Ann. Surg.
- Molecular background of the regional lymph node metastasis of gastric cancer. Oncol. Lett.
- Cell-of-Origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell