Abstract:
Advances in DNA sequencing technology have led to a tremendous accumulation of sequence data, which has encouraged researchers to develop more robust analysis methods. Promoter sequences are an important part of these DNA sequences and take part in gene expression and regulation. Here, we perform alignment-free, variance-based feature selection on Position Specific Motif Matrices of promoter sequences. We then analyze the similarities and differences among these sequences using the cumulative distribution of the selected features (motifs). To demonstrate the efficacy of the proposed technique, we use promoter data from the NCBI database. The similarity and dissimilarity values are enhanced when only the selected motifs are used instead of all features. Hence, combining feature selection with analysis based on the cumulative distribution of motifs has the potential to accentuate the similarities and differences that exist between promoter sequences.
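
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the Position Specific Motif Matrices are built, so the sketch assumes plain overlapping k-mer counts as the motif features, population variance across sequences as the selection score, and the maximum gap between two cumulative distributions as the dissimilarity value. All function names, the choice of k, and the toy sequences are hypothetical.

```python
from itertools import product


def kmer_counts(seq, k=2):
    """Count overlapping occurrences of every DNA k-mer in seq.

    Assumption: k-mer counts stand in for the paper's motif matrix.
    """
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    return counts


def select_by_variance(count_rows, top_n):
    """Return the top_n motifs whose counts vary most across sequences."""
    def variance(motif):
        values = [row[motif] for row in count_rows]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    return sorted(count_rows[0], key=variance, reverse=True)[:top_n]


def cumulative_distribution(counts, motifs):
    """Running share of the total count, taken over motifs in the given order."""
    total = sum(counts[m] for m in motifs)
    running, cdf = 0, []
    for m in motifs:
        running += counts[m]
        cdf.append(running / total if total else 0.0)
    return cdf


def cdf_distance(cdf_a, cdf_b):
    """Largest absolute gap between two cumulative distributions (assumed metric)."""
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# Toy sequences, not real promoter data.
sequences = ["ACGTACGT", "AAAAAAAA", "ACGTACGA"]
rows = [kmer_counts(s) for s in sequences]
selected = select_by_variance(rows, top_n=4)
cdfs = [cumulative_distribution(r, selected) for r in rows]
print(selected, cdf_distance(cdfs[0], cdfs[1]))
```

Restricting the comparison to high-variance motifs drops features that are nearly constant across sequences, which is one plausible reason the selected subset could sharpen the similarity and dissimilarity values relative to using all motifs.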
Published in: 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
Date of Conference: 21-24 September 2016
Date Added to IEEE Xplore: 03 November 2016