Abstract
This study aims to develop new classifiers that can effectively integrate and analyze biomedical data obtained from various sources through high-throughput technologies. The use of explainable models is particularly important as they offer insights into the relationships and patterns within the data, which leads to a better understanding of the underlying processes.
The objective of this research is to examine the effectiveness of decision trees combined with Relative eXpression Analysis (RXA) for classifying multi-omics data. Several strategies for integrating the separate data sources are evaluated, each based on a different pairwise relationship between features. We propose a multi-test approach that combines linked top-scoring pairs from different omics in each internal node of the hierarchical classification model. To address the significant computational cost of RXA, the most time-consuming steps are parallelized on a GPU. The proposed solution was experimentally validated on single-omics and multi-omics datasets. The results show that the proposed approach generates more accurate and interpretable predictions than commonly used tree-based solutions.
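To make the idea of a top-scoring pair concrete, here is a minimal sketch of the classic two-class pair score from relative expression analysis: for features i and j, the score is the absolute difference between the classes in how often feature i is smaller than feature j. The function name `top_scoring_pair`, the toy data, and the exhaustive NumPy search are illustrative assumptions; this is not the chapter's multi-test tree or its GPU implementation.

```python
import numpy as np

def top_scoring_pair(X, y):
    """Return (i, j, score) for the feature pair whose ordering best separates two classes.

    X: (n_samples, n_features) array of expression values (hypothetical input layout).
    y: array of 0/1 class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # less[s, i, j] is True when feature i is smaller than feature j in sample s
    less = X[:, :, None] < X[:, None, :]
    p0 = less[y == 0].mean(axis=0)  # estimated P(x_i < x_j | class 0)
    p1 = less[y == 1].mean(axis=0)  # estimated P(x_i < x_j | class 1)
    score = np.abs(p0 - p1)
    # argmax returns the first maximum, so ties resolve to the lowest flat index
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(i), int(j), float(score[i, j])

# Toy data: feature 0 is below feature 1 in class 0 and above it in class 1
X = np.array([
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 2.5],
    [7.0, 4.0, 3.0],
    [8.0, 3.0, 2.0],
])
y = np.array([0, 0, 1, 1])
i, j, score = top_scoring_pair(X, y)
print(i, j, score)
```

The resulting rule is simply "predict by whether feature i is smaller than feature j", which is what makes such models easy to read. Because the exhaustive search compares every pair of features, its cost grows quadratically with the number of features, which is consistent with the computational burden the abstract mentions.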
Acknowledgments
This project was funded by the Polish National Science Centre and allocated on the basis of decision 2019/33/B/ST6/02386 from BUT, funded by the Polish Ministry of Science and Higher Education (first and second author). The third author was supported by grant WZ/WI-IIT/4/2023 from BUT, funded by the Polish Ministry of Science and Higher Education.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Czajkowski, M., Jurczuk, K., Kretowski, M. (2023). Hierarchical Relative Expression Analysis in Multi-omics Data Classification. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14074. Springer, Cham. https://doi.org/10.1007/978-3-031-36021-3_69
DOI: https://doi.org/10.1007/978-3-031-36021-3_69
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36020-6
Online ISBN: 978-3-031-36021-3
eBook Packages: Computer Science, Computer Science (R0)