A comparison of deep neural network models for cluster cancer patients through somatic point mutations

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

It is now well known that genetic mutations contribute to the development of tumors: at least 15% of cancer patients carry a causative genetic abnormality, including de novo somatic point mutations. This highlights the importance of identifying the responsible mutations and their associated biomarkers (e.g., genes) for early detection in high-risk cancer patients. Next-generation sequencing technologies have given researchers an excellent opportunity to study associations between de novo somatic mutations and cancer progression by identifying cancer subtypes and subtype-specific biomarkers. Simple linear classification models have been used for somatic point mutation-based cancer classification (SMCC); however, because of cancer genetic heterogeneity (ranging from 50 to 80%), high data sparsity, and the small number of cancer samples, these simple linear classifiers classify cancer subtypes poorly. In this study, we evaluated three advanced deep neural network-based classifiers to find and optimize the best model for cancer subtyping. To address the above-mentioned complexity, we used two pre-processing steps, clustered gene filtering (CGF) and indexed sparsity reduction (ISR), together with regularization methods, a Global-Max-Pooling layer, and an embedding layer. We evaluated and optimized three deep learning models, CNN, LSTM, and a hybrid CNN + LSTM model, on the publicly available TCGA-DeepGene dataset, a re-formulated subset of The Cancer Genome Atlas (TCGA) dataset; the performance measure for these models is 10-fold cross-validation accuracy. Evaluating all three models with the same criterion on the test dataset showed that the CNN, LSTM, and CNN + LSTM models achieve accuracies of 66.45%, 40.89%, and 41.20%, respectively, in somatic point mutation-based cancer classification. Based on our results, we propose the CNN model for further experiments on cancer subtyping based on DNA mutations.
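
As an illustration of the evaluation criterion named in the abstract, the following is a minimal sketch of how a 10-fold cross-validation accuracy can be computed. It is not the authors' code: the helper names (`k_fold_indices`, `majority_class_fit`, `cross_val_accuracy`), the toy samples, and the majority-class "model" are all hypothetical stand-ins, and a real experiment would plug in the CNN, LSTM, or CNN + LSTM classifier in place of the toy one.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class_fit(train_x, train_y):
    """Hypothetical stand-in for a real model: always predicts the most
    frequent training label, ignoring the input sample."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda sample: most_common


def cross_val_accuracy(samples, labels, fit, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the
    held-out accuracies over all k folds."""
    scores = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = fit(train_x, train_y)
        correct = sum(predict(samples[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Toy data (assumption): each sample is a short list of mutated-gene indices,
# and each label is a made-up cancer type.
samples = [[i % 7, (i * 3) % 11] for i in range(100)]
labels = ["A" if i % 4 else "B" for i in range(100)]

accuracy = cross_val_accuracy(samples, labels, majority_class_fit, k=10)
print(f"10-fold cross-validation accuracy: {accuracy:.2%}")
```

Because every sample is held out exactly once, the averaged score reflects performance on data the model did not see during training, which matters when the number of cancer samples is small.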

Data availability

The dataset analyzed during the current study is available in the DeepGene repository, https://github.com/yuanyc06/deepgene.

Acknowledgements

HAR has been supported by a UNSW Scientia Program Fellowship. Analysis was made possible with computational resources provided by the BioMedical Machine Learning high performance computing server, with funding from the Australian Government and UNSW Sydney.

Author information

Contributions

HAR and PP designed the study; PP, MF, and MR designed the models. PP, HAR, and MF wrote the paper. HAR, MF, MR, and PP edited the manuscript. PP carried out all the analyses, including the statistical analyses, model development, and comparison. PP generated all figures and tables. All authors have read and approved the final version of the paper.

Corresponding author

Correspondence to Mansoor Fateh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 641 KB)

Supplementary file2 (RAR 2175 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Parhami, P., Fateh, M., Rezvani, M. et al. A comparison of deep neural network models for cluster cancer patients through somatic point mutations. J Ambient Intell Human Comput 14, 10883–10898 (2023). https://doi.org/10.1007/s12652-022-04351-5

  • DOI: https://doi.org/10.1007/s12652-022-04351-5
