Abstract
This paper considers machine learning for disease diagnosis based on analysis of a complete gene expression profile. This distinguishes our study from approaches that require a preliminary step of selecting a limited number of relevant genes (tens or hundreds). We conducted experiments with complete gene expression profiles (20 531 genes) obtained by processing the transcriptomes of 801 patients with known oncologic diagnoses (cancers of the lung, kidney, breast, prostate, and colon). Applying the indextron (instant learning index system) to a new task, the processing of complete expression profiles, yielded a diagnostic accuracy of 99.75%, measured as agreement with the results of histological verification.
Additional information
This paper was recommended for publication by O.N. Granichin, a member of the Editorial Board
Cite this article
Mikhailov, A.M., Karavai, M.F., Sivtsov, V.A. et al. Machine Learning for Diagnosis of Diseases with Complete Gene Expression Profile. Autom Remote Control 84, 727–733 (2023). https://doi.org/10.1134/S0005117923070093
DOI: https://doi.org/10.1134/S0005117923070093