Designing and Evaluating Deep Learning Models for Cancer Detection on Gene Expression Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11925)

Abstract

Transcription profiling enables researchers to understand the activity of genes under various experimental conditions; in human genomics, abnormal gene expression is typically correlated with clinical conditions. An important application is identifying the genes most involved in tumor development by contrasting normal and tumor cells from the same patient. Several statistical and machine learning techniques have been applied to cancer detection; more recently, deep learning methods have been attempted, but they have typically failed to match the performance of classical algorithms. In this paper, we design a set of deep learning methods that achieve performance comparable to that of the best machine learning methods by using external information or data augmentation; we demonstrate this result by comparing the new methods against several baselines.

A. Canakoglu and L. Nanni—Co-primary authors.

Notes

  1. http://xena.ucsc.edu.

Acknowledgment

This work is supported by the ERC Advanced Grant 693174 (Data-Driven Genomic Computing) and by the Amazon Machine Learning Research Award on Data-driven Machine and Deep Learning for Genomics.

Author information

Corresponding authors

Correspondence to Arif Canakoglu or Luca Nanni.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Canakoglu, A., Nanni, L., Sokolovsky, A., Ceri, S. (2020). Designing and Evaluating Deep Learning Models for Cancer Detection on Gene Expression Data. In: Raposo, M., Ribeiro, P., Sério, S., Staiano, A., Ciaramella, A. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2018. Lecture Notes in Computer Science, vol 11925. Springer, Cham. https://doi.org/10.1007/978-3-030-34585-3_22

  • DOI: https://doi.org/10.1007/978-3-030-34585-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34584-6

  • Online ISBN: 978-3-030-34585-3

  • eBook Packages: Computer Science, Computer Science (R0)
