Construction of Gene Network Based on Inter-tumor Heterogeneity for Tumor Type Identification

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13394)

Abstract

Tumor heterogeneity is one of the challenges in studying malignant tumors. In general, tumors are driven by combinations of mutated genes that vary greatly from patient to patient. Constructing cancer gene networks from omics data is of great significance for understanding the underlying mechanisms of cancer. In this work, we present RFNE, a network embedding method based on gene networks constructed with random forests (RF). The gene networks are built by integrating omics data from heterogeneous samples of different cancers. We assume that whenever two genes fall in the same leaf node of a tree, their proximity increases by one; summing these counts over all trees and dividing by the total number of trees in the forest yields the pairwise closeness between genes. A multi-layer weighted graph is then constructed, and random walks are used to capture distant but structurally similar gene pairs in the network. Finally, we use a network embedding approach to encode genes and integrate mutation data from 5290 samples of 14 cancer types to construct patient features. We then use the LightGBM classification algorithm to make predictions for patients. Experimental results show that our method outperforms three other methods. By applying an unsupervised clustering method to the learned patient features, we can also separate patients with a specific cancer type into several subtypes. In most of the 14 cancer types, the patient subgroups revealed by RFNE are significantly linked with patient survival.
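
The leaf-based closeness described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `leaf_proximity`, the input layout (one row per gene, one column per tree, each entry the index of the leaf that gene reaches) and the toy values are assumptions made for illustration only. In practice, such leaf indices could be obtained from a fitted forest, for example with scikit-learn's `RandomForestClassifier.apply`.

```python
from itertools import combinations


def leaf_proximity(leaf_ids):
    """Pairwise closeness from leaf co-occurrence (hypothetical helper).

    leaf_ids[i][t] is the index of the leaf that item i (here, a gene)
    reaches in tree t. Each tree in which two items share a leaf adds one
    to their count; the count is then divided by the number of trees.
    """
    n_items = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        prox[i][i] = 1.0  # an item always shares every leaf with itself
    for i, j in combinations(range(n_items), 2):
        shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
        prox[i][j] = prox[j][i] = shared / n_trees
    return prox


# Toy input (made-up values): 3 genes, 4 trees.
toy_leaf_ids = [
    [0, 2, 1, 3],  # gene A
    [0, 2, 4, 3],  # gene B: shares a leaf with A in trees 0, 1 and 3
    [5, 1, 1, 0],  # gene C: shares a leaf with A in tree 2 only
]
proximity = leaf_proximity(toy_leaf_ids)
print(proximity[0][1], proximity[0][2], proximity[1][2])  # 0.75 0.25 0.0
```

The resulting symmetric matrix can be read as a weighted adjacency matrix over genes, which is the kind of input a graph-based embedding step would consume.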

Code Availability:

The code of RFNE is available on GitHub: github.com/FengLi12/RFNE.

Funding

This work was supported by the National Natural Science Foundation of China (61902216, 61972236 and 61972226) and the Natural Science Foundation of Shandong Province (No. ZR2018MF013).

Author information

Corresponding author

Correspondence to Feng Li.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sun, Z. et al. (2022). Construction of Gene Network Based on Inter-tumor Heterogeneity for Tumor Type Identification. In: Huang, DS., Jo, KH., Jing, J., Premaratne, P., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2022. Lecture Notes in Computer Science, vol 13394. Springer, Cham. https://doi.org/10.1007/978-3-031-13829-4_29

  • DOI: https://doi.org/10.1007/978-3-031-13829-4_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13828-7

  • Online ISBN: 978-3-031-13829-4

  • eBook Packages: Computer Science (R0)
