Differential Gene Expression Analysis of the Most Relevant Genes for Lung Cancer Prediction and Sub-type Classification

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2022)

Abstract

Early diagnosis of cancer is essential for a good prognosis, and identifying differentially expressed genes enables a more personalized treatment plan in which therapy can target those genes. This work proposes a pipeline that predicts the presence of lung cancer and its subtype, allowing the identification of differentially expressed genes for the lung adenocarcinoma and squamous cell carcinoma subtypes. A gradient boosted tree model is used for the classification tasks, based on RNA-seq data. The main focus of the present work is the analysis of the gene expressions that best differentiate cancerous from normal tissue and of the features that distinguish between lung cancer subtypes. Differentially expressed genes are analyzed by hierarchical clustering in order to identify gene signatures that are commonly regulated and biological signatures associated with a specific subtype. This analysis highlighted patterns of commonly regulated genes already known in the literature as cancer- or subtype-specific genes, as well as others not yet documented in the literature.
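The two steps named in the abstract (gradient boosted tree classification, then hierarchical clustering of the most relevant genes) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `GradientBoostingClassifier` and SciPy's `linkage`/`fcluster` as stand-ins, uses synthetic data in place of RNA-seq counts, and ranks genes by the model's feature importances. The function names `make_synthetic_expression` and `rank_and_cluster_genes`, and all parameter values, are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split


def make_synthetic_expression(n_samples=120, n_genes=40, n_informative=6, seed=0):
    """Build a toy samples-by-genes matrix (an assumption, not real RNA-seq data).

    The first `n_informative` genes are shifted upward in class 1 so that
    they behave like differentially expressed genes.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)            # 0 = normal, 1 = tumor
    X = rng.normal(0.0, 1.0, size=(n_samples, n_genes))
    X[:, :n_informative] += 2.5 * y[:, None]
    return X, y


def rank_and_cluster_genes(X, y, n_top=6, n_clusters=2, seed=0):
    """Fit a gradient boosted tree model, then cluster its top-ranked genes."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)

    # Rank genes by impurity-based importance, most important first.
    top_genes = np.argsort(model.feature_importances_)[::-1][:n_top]

    # Hierarchical clustering of the selected genes: each row passed to
    # linkage is one gene's expression profile across all samples.
    profiles = X[:, top_genes].T
    tree = linkage(profiles, method="average", metric="correlation")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return accuracy, top_genes, labels


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    accuracy, top_genes, labels = rank_and_cluster_genes(X, y)
    print("held-out accuracy:", accuracy)
    print("top gene indices:", top_genes)
    print("cluster labels:", labels)
```

Correlation distance is used here so that genes with similar expression patterns across samples group together regardless of absolute level; the linkage method and the number of clusters are illustrative choices, not values taken from the paper.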

Acknowledgment

This work was partially funded by the Project TAMI - Transparent Artificial Medical Intelligence (NORTE-01-0247-FEDER-045905), financed by ERDF - European Regional Development Fund through the North Portugal Regional Operational Program - NORTE 2020 and by the Portuguese Foundation for Science and Technology - FCT under the CMU - Portugal International Partnership.

This work is also financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, under PhD grant number 2021.05767.BD.

Author information

Corresponding author

Correspondence to Hélder P. Oliveira.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Ramos, B., Pereira, T., Silva, F., Costa, J.L., Oliveira, H.P. (2022). Differential Gene Expression Analysis of the Most Relevant Genes for Lung Cancer Prediction and Sub-type Classification. In: Pinho, A.J., Georgieva, P., Teixeira, L.F., Sánchez, J.A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2022. Lecture Notes in Computer Science, vol 13256. Springer, Cham. https://doi.org/10.1007/978-3-031-04881-4_15

  • DOI: https://doi.org/10.1007/978-3-031-04881-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04880-7

  • Online ISBN: 978-3-031-04881-4

  • eBook Packages: Computer Science, Computer Science (R0)
