Abstract:
In Genome-Wide Association Studies (GWAS), large amounts of genetic information are analyzed to discover how the observed variations, specifically Single Nucleotide Polymorphisms (SNPs), relate to a trait of interest, such as susceptibility to a disease. However, the high dimensionality of the datasets poses significant challenges for methods that try to identify the relevant SNPs and their interactions. In particular, we emphasize the challenge posed by the large number of irrelevant dimensions, which obscure the information under study. In this work, we present a prototype-based classification method, derived from Learning Vector Quantization (LVQ), in which the relevance of each input dimension is learned independently for each prototype. We validate our method on simulated GWAS datasets with 20, 50, or 100 dimensions, of which only a few (2 to 5) are relevant and must be identified. The proposed method shows promising results, degrading gracefully as the number of irrelevant dimensions increases, in comparison with Multifactor Dimensionality Reduction (MDR), Generalized Relevance Learning Vector Quantization (GRLVQ) and Supervised Relevance Neural Gas (SRNG).
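The idea of a separate relevance vector per prototype can be illustrated with a minimal sketch. The abstract specifies neither the distance function nor the learning rule, so the code below rests on assumptions: a relevance-weighted squared Euclidean distance of the kind used in GRLVQ-style methods, SNP genotypes encoded as 0, 1, or 2, and hand-picked (not learned) prototypes and relevances. All names (`weighted_sq_distance`, `classify`, the dictionary keys) are hypothetical, and only the classification step is shown, not training.

```python
def weighted_sq_distance(x, prototype, relevance):
    """Squared Euclidean distance with one relevance weight per dimension.

    Assumption: this weighted form is illustrative; the paper's exact
    distance is not given in the abstract.
    """
    return sum(r * (xi - wi) ** 2 for xi, wi, r in zip(x, prototype, relevance))


def classify(x, prototypes):
    """Return the label of the prototype closest to x.

    Each prototype carries its OWN relevance vector, so a dimension can
    matter for one prototype and be ignored by another.
    """
    best = min(
        prototypes,
        key=lambda p: weighted_sq_distance(x, p["w"], p["relevance"]),
    )
    return best["label"]


# Hypothetical, hand-set prototypes over four SNPs (genotypes coded 0/1/2).
# A relevance of 0.0 means that prototype ignores the dimension entirely.
prototypes = [
    {"w": [0, 0, 1, 2], "relevance": [0.5, 0.5, 0.0, 0.0], "label": "control"},
    {"w": [2, 2, 1, 0], "relevance": [0.0, 0.5, 0.5, 0.0], "label": "case"},
]

sample = [2, 2, 1, 1]
predicted = classify(sample, prototypes)
print(predicted)
```

In this toy setup the fourth SNP has zero relevance for both prototypes, so it never influences the decision, which mirrors the goal of suppressing irrelevant dimensions.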
Date of Conference: 04-09 August 2013
Date Added to IEEE Xplore: 09 January 2014