Abstract:
Advances in science and technology, together with growing research attention to cancer, have produced a proliferation of omics resources for in-depth analysis. The growing volume and complexity of these biological data have encouraged the adoption of machine learning in biology, and numerous machine-learning methods have been developed to identify driver mutations. Many of these methods, however, produce complex models that are difficult to interpret, which obscures how the input features influence the predictions. We present CIXG, a framework that combines an XGBoost-based driver gene prediction module with a causal interpretation module built on CXPlain. This architecture quantifies each input feature’s contribution to the prediction outcome while maintaining accurate driver gene prediction. When benchmarked against the state-of-the-art (SOTA) method, CIXG identified driver genes more accurately both in pan-cancer studies and within the 32 specific cancer types. Our results further indicate that mutation features contribute most to CIXG’s predictions, with other omics features providing additional support.
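To make "quantifying each input feature's contribution" concrete, here is a minimal sketch of a masking-based, Granger-causal style attribution of the kind CXPlain-style explainers are trained to approximate, as we understand that approach. It is not the CIXG implementation and does not use the XGBoost or CXPlain libraries: `predict` is a hand-written stand-in for a trained classifier, and the feature names, weights, and zero baseline are all hypothetical.

```python
import math

# Hypothetical feature names; the abstract only mentions "mutation" and
# "other omics" features without listing them.
FEATURES = ["mutation_freq", "expression", "methylation"]


def predict(x):
    """Stand-in for a trained classifier (e.g. an XGBoost model).

    Returns an illustrative probability that a gene is a driver.
    The weights are made up for the example, not learned from data.
    """
    weights = [3.0, 0.8, 0.4]
    bias = -2.0
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(p, y):
    """Binary cross-entropy for one sample, clipped for numerical safety."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def attributions(x, y, baseline):
    """Score each feature by how much the error grows when it is masked.

    Masking replaces the feature with its baseline value. Error increases
    are floored at zero and normalized so the scores sum to 1.
    """
    full_error = log_loss(predict(x), y)
    deltas = []
    for i in range(len(x)):
        masked = list(x)
        masked[i] = baseline[i]
        masked_error = log_loss(predict(masked), y)
        deltas.append(max(masked_error - full_error, 0.0))
    total = sum(deltas)
    if total == 0.0:
        # No feature changes the error: fall back to uniform scores.
        return [1.0 / len(x)] * len(x)
    return [d / total for d in deltas]


# One illustrative gene labeled as a driver (y = 1), with a zero baseline.
scores = attributions([1.0, 1.0, 1.0], 1, [0.0, 0.0, 0.0])
for name, score in zip(FEATURES, scores):
    print(f"{name}: {score:.3f}")
```

In this toy setup the mutation feature receives the largest score only because the stand-in model was given the largest weight for it; the sketch illustrates the attribution mechanics and does not reproduce the paper's finding.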
Date of Conference: 03-06 December 2024
Date Added to IEEE Xplore: 10 January 2025