Abstract:
For DNA sequence compression, methods based on Markov modeling and repeats have been observed to give the best results. However, these methods tend to assume that mismatches in approximate repeats are uniformly distributed. We show that these replacements are not uniformly distributed and that compression efficiency can be improved by using a non-uniform distribution for mismatches. We also propose a hash-table-based method to predict repeat locations, which works well for block-based genomic sequence compression algorithms. The proposed methods give good compression gains, and they can be incorporated into any algorithm that uses approximate repeats to realize similar gains.
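The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the k-mer length, and the choice of "most recent earlier occurrence" as the predicted repeat location are all assumptions made for illustration. The first function compares the idealized cost of coding mismatches under a uniform model (each of the three other bases equally likely) with the cost under an empirical conditional distribution, ignoring the cost of transmitting the model itself. The remaining two functions show one plausible way a hash table of k-mers could suggest where a repeat of the current block begins.

```python
import math
from collections import Counter, defaultdict


def mismatch_cost_bits(pairs):
    """Idealized bits needed to code mismatches (hypothetical helper).

    pairs: list of (predicted_base, actual_base) with predicted != actual.
    Returns (uniform_bits, empirical_bits).
    """
    # Uniform assumption: the actual base is one of 3 equally likely alternatives.
    uniform_bits = len(pairs) * math.log2(3)

    # Empirical conditional distribution P(actual | predicted), estimated from
    # the same data; the cost of sending this model is ignored (assumption).
    pair_counts = Counter(pairs)
    predicted_totals = Counter(predicted for predicted, _ in pairs)
    empirical_bits = -sum(
        n * math.log2(n / predicted_totals[predicted])
        for (predicted, _), n in pair_counts.items()
    )
    return uniform_bits, empirical_bits


def build_kmer_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def predict_repeat_start(index, block, k):
    """Guess a repeat location for a block from its first k-mer.

    Returns the most recent indexed position of that k-mer, or None if the
    k-mer has not been seen. Picking the latest hit is an assumption.
    """
    hits = index.get(block[:k])
    return hits[-1] if hits else None


# Skewed mismatches: A is replaced by G far more often than by C or T.
pairs = [("A", "G")] * 8 + [("A", "C")] + [("A", "T")]
uniform_bits, empirical_bits = mismatch_cost_bits(pairs)

history = "ACGTACGTTTGA"
index = build_kmer_index(history, 4)
candidate = predict_repeat_start(index, "ACGTAA", 4)
```

When the mismatch counts are skewed, as in the toy data above, the empirical cost falls below the uniform cost, which is the kind of gain the abstract attributes to a non-uniform mismatch model.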
Published in: 2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)
Date of Conference: 18 December 2010
Date Added to IEEE Xplore: 28 January 2011