Abstract:
Gene sets have been widely used on genome-scale data for various purposes. Ideally, gene sets should span multiple scales so that they can explain biological processes at different scales and depths, a property that most popular algorithmically defined gene sets lack. We propose a principled way to generate multiscale gene sets based on protein-protein interaction (PPI) networks and techniques from multiscale harmonic analysis. Specifically, on a yeast PPI network, we adopt the diffusion wavelets tool developed by Coifman and Maggioni and modify it for heavy-tailed graphs. We then define gene sets through a tiling of the PPI network based on the scaling functions. We compare our multiscale gene sets to two standard gene set databases (GO and KEGG) and to gene sets derived from a hierarchical clustering method. We find that our gene sets have a large, non-trivial overlap with the standard databases, yet also retain a sizeable non-overlapping portion. In addition, the sense of scale in our gene sets matches well with that in GO. Finally, we use yeast cell cycle experiments to demonstrate the potential use of our multiscale gene sets.
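To make the overlap comparison concrete, the following minimal sketch scores each candidate gene set against a reference collection. It assumes the Jaccard index as the overlap measure, which the abstract does not specify; the function names, the set labels, and the gene identifiers (`g1`, `g2`, ...) are hypothetical placeholders, not data from the paper.

```python
def jaccard(a, b):
    """Jaccard index of two gene collections (assumed overlap measure)."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as having no overlap.
    return len(a & b) / len(union) if union else 0.0


def best_matches(candidate_sets, reference_sets):
    """For each candidate set, return (best reference name, Jaccard score)."""
    result = {}
    for name, genes in candidate_sets.items():
        scored = [
            (jaccard(genes, ref_genes), ref_name)
            for ref_name, ref_genes in reference_sets.items()
        ]
        score, ref_name = max(scored) if scored else (0.0, None)
        result[name] = (ref_name, score)
    return result


# Toy data only: two nested candidate sets standing in for two scales,
# and two reference sets standing in for database annotations.
candidate = {
    "scale1_tile0": {"g1", "g2", "g3"},
    "scale2_tile0": {"g1", "g2", "g3", "g4", "g5"},
}
reference = {
    "ref_A": {"g1", "g2", "g3", "g4"},
    "ref_B": {"g5", "g6"},
}

matches = best_matches(candidate, reference)
for name, (ref_name, score) in matches.items():
    print(f"{name}: best match {ref_name} (Jaccard = {score:.2f})")
```

A score well above zero but below one corresponds to the situation the abstract describes: substantial overlap with a reference set, together with genes that the reference set does not contain.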
Date of Conference: 03-05 December 2013
Date Added to IEEE Xplore: 13 February 2014
Electronic ISBN: 978-1-4799-0248-4