A comparison of deep neural network models for cluster cancer patients through somatic point mutations

Parhami, Pouria; Fateh, Mansoor; Rezvani, Mohsen; Alinejad-Rokny, Hamid

doi:10.1007/s12652-022-04351-5

Original Research
Published: 26 August 2022

Volume 14, pages 10883–10898, (2023)
Journal of Ambient Intelligence and Humanized Computing

Pouria Parhami¹,
Mansoor Fateh ORCID: orcid.org/0000-0003-2133-3480¹,
Mohsen Rezvani¹ &
Hamid Alinejad-Rokny²

Abstract

It is now well known that genetic mutations contribute to the development of tumors: at least 15% of cancer patients carry a causative genetic abnormality, including de novo somatic point mutations. This highlights the importance of identifying the responsible mutations and the associated biomarkers (e.g., genes) for early detection in high-risk cancer patients. Next-generation sequencing technologies have given researchers an excellent opportunity to study associations between de novo somatic mutations and cancer progression by identifying cancer subtypes and subtype-specific biomarkers. Simple linear classification models have been used for somatic point mutation-based cancer classification (SMCC); however, because of cancer genetic heterogeneity (ranging from 50 to 80%), high data sparsity, and the small number of cancer samples, these simple linear classifiers classify cancer subtypes poorly. In this study, we evaluated three advanced deep neural network-based classifiers to find and optimize the best model for cancer subtyping. To address the above-mentioned complexity, we used the pre-processing techniques clustered gene filtering (CGF) and indexed sparsity reduction (ISR), regularization methods, a Global-Max-Pooling layer, and an embedding layer. We evaluated and optimized three deep learning models, CNN, LSTM, and a hybrid CNN + LSTM model, on the publicly available TCGA-DeepGene dataset, a re-formulated subset of The Cancer Genome Atlas (TCGA) dataset; the performance measure for these models is 10-fold cross-validation accuracy. Evaluating all three models with the same criterion on the test dataset showed that CNN, LSTM, and CNN + LSTM achieve 66.45%, 40.89%, and 41.20% accuracy, respectively, in somatic point mutation-based cancer classification. Based on our results, we propose the CNN model for further experiments on cancer subtyping based on DNA mutations.
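
The evaluation criterion named in the abstract, 10-fold cross-validation accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the helper names (k_fold_indices, cross_val_accuracy), the majority-class stand-in classifier, and the toy labels are all assumptions made for illustration, and the sketch uses plain shuffled folds because the abstract does not say whether the folds were stratified.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average the k accuracies.

    Assumes len(labels) >= k so that no fold is empty.
    """
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Hypothetical stand-in for a real model (CNN, LSTM, ...):
# it always predicts the most common label seen during training.
def fit_majority(train_x, train_y):
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Toy data, not taken from the paper: 70 samples of class "A", 30 of class "B".
toy_labels = ["A"] * 70 + ["B"] * 30
toy_features = [[0] for _ in toy_labels]
toy_accuracy = cross_val_accuracy(toy_features, toy_labels, fit_majority, predict_majority)
print(f"mean 10-fold accuracy: {toy_accuracy:.2f}")
```

Any of the three models could be plugged in through the fit and predict arguments, so that all of them are scored on identical folds, which is one way to read "the same criterion" in the abstract.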

Data availability

The dataset analyzed during the current study is available in the DeepGene repository, https://github.com/yuanyc06/deepgene.

Acknowledgements

HAR was supported by the UNSW Scientia Program Fellowship. The analysis was made possible by computational resources provided by the BioMedical Machine Learning high-performance computing server, with funding from the Australian Government and UNSW Sydney.

Author information

Authors and Affiliations

Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
Pouria Parhami, Mansoor Fateh & Mohsen Rezvani
BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW SYDNEY, Sydney, NSW, 2052, Australia
Hamid Alinejad-Rokny

Contributions

HAR and PP designed the study; PP, MF, and MR designed the models. PP, HAR, and MF wrote the paper. HAR, MF, MR, and PP edited the manuscript. PP carried out all the analyses, including the statistical analyses, model development, and comparisons. PP generated all figures and tables. All authors have read and approved the final version of the paper.

Corresponding author

Correspondence to Mansoor Fateh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 641 KB)

Supplementary file2 (RAR 2175 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Parhami, P., Fateh, M., Rezvani, M. et al. A comparison of deep neural network models for cluster cancer patients through somatic point mutations. J Ambient Intell Human Comput 14, 10883–10898 (2023). https://doi.org/10.1007/s12652-022-04351-5

Received: 30 June 2021
Accepted: 19 July 2022
Published: 26 August 2022
Issue Date: August 2023
DOI: https://doi.org/10.1007/s12652-022-04351-5
