TriGORank: A Gene Ontology Enriched Learning-to-Rank Framework for Trigenic Fitness Prediction
- Univ. of Illinois at Urbana-Champaign, IL (United States)
- Univ. of Illinois at Urbana-Champaign, IL (United States); Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)
Machine learning (ML) has been gaining interest in the metabolic engineering community as a means to automate prediction tasks. In this work, we introduce and study the task of using ML to recommend high-fitness triplet mutants as candidates for wet-lab experiments. We first use individual and digenic fitness scores as features to train ML models that rank triplet gene mutants of S. cerevisiae from high to low fitness. We then incorporate prior metabolic knowledge from an existing gene ontology by designing a novel graph representation and deriving features that capture gene similarity and gene interactions. Experimental results show that the proposed gene ontology enriched model, termed TriGORank, improves both performance and explainability.
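As a rough illustration of the pipeline the abstract outlines, the following minimal sketch builds a feature vector for each gene triplet from individual fitness, digenic fitness, and an ontology-based similarity, then sorts triplets by a score. Everything in it is an assumption made for illustration: the gene names, fitness values, and term labels are invented toy data, Jaccard overlap of annotation sets stands in for the paper's graph-derived features (which the abstract does not specify), and the hand-set linear weights stand in for a trained learning-to-rank model.

```python
from itertools import combinations

# Hypothetical toy data: none of these names or values come from the paper.
single_fitness = {"g1": 0.95, "g2": 0.80, "g3": 1.00, "g4": 0.70}
digenic_fitness = {
    frozenset(("g1", "g2")): 0.78,
    frozenset(("g1", "g3")): 0.97,
    frozenset(("g1", "g4")): 0.60,
    frozenset(("g2", "g3")): 0.82,
    frozenset(("g2", "g4")): 0.55,
    frozenset(("g3", "g4")): 0.72,
}
# Placeholder annotation labels, standing in for gene ontology terms.
annotations = {
    "g1": {"t1", "t2"},
    "g2": {"t2", "t3"},
    "g3": {"t1", "t2", "t4"},
    "g4": {"t5"},
}

# Hand-set weights; in a real setup a ranking model would learn these.
WEIGHTS = [0.3, 0.1, 0.4, 0.1, 0.1]


def jaccard(a, b):
    """Overlap of two annotation sets, used here as a crude gene similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def triplet_features(triplet):
    """Aggregate single, pairwise, and similarity values into one vector."""
    genes = sorted(triplet)
    pairs = list(combinations(genes, 2))
    singles = [single_fitness[g] for g in genes]
    doubles = [digenic_fitness[frozenset(p)] for p in pairs]
    sims = [jaccard(annotations[x], annotations[y]) for x, y in pairs]
    return [
        sum(singles) / len(singles),
        min(singles),
        sum(doubles) / len(doubles),
        min(doubles),
        sum(sims) / len(sims),
    ]


def score(features, weights):
    """Linear score: a stand-in for the output of a trained ranker."""
    return sum(w * f for w, f in zip(weights, features))


def rank_triplets(genes, weights):
    """Return all gene triplets ordered from highest to lowest score."""
    triplets = list(combinations(sorted(genes), 3))
    return sorted(
        triplets,
        key=lambda t: score(triplet_features(t), weights),
        reverse=True,
    )


if __name__ == "__main__":
    for t in rank_triplets(single_fitness, WEIGHTS):
        print(t, round(score(triplet_features(t), WEIGHTS), 3))
```

Taking the mean and minimum over the three singles and three pairs keeps the feature vector the same length regardless of gene order, which is one simple way to make a triplet representation order-invariant. In practice the hand-set weights would be replaced by a model trained on measured trigenic fitness.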
- Research Organization:
- Univ. of Illinois at Urbana-Champaign, IL (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- SC0018420
- OSTI ID:
- 1902720
- Journal Information:
- Conference: 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX (United States), 9-12 Dec 2021
- Country of Publication:
- United States
- Language:
- English