
Algorithm for DNA sequence compression based on prediction of mismatch bases and repeat location


Abstract:

For DNA sequence compression, methods based on Markov modeling and on repeats have been observed to give the best results. However, these methods tend to assume that mismatches in approximate repeats are uniformly distributed. We show that these substitutions are not uniformly distributed, and that compression efficiency can be improved by using a non-uniform distribution for mismatches. We also propose a hash-table-based method for predicting repeat locations, which works well for block-based genomic sequence compression algorithms. The proposed methods give good compression gains, and they can be incorporated into any algorithm that uses approximate repeats to realize similar gains.
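
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the mismatch model or the hashing scheme, so the conditional model P(actual base | reference base), the k-mer index, and the names `mismatch_cost_bits`, `build_kmer_index`, and `predict_repeat_start` are all assumptions made for illustration. The non-uniform cost is an idealized empirical entropy that ignores the cost of transmitting the model itself.

```python
import math
from collections import Counter, defaultdict


def mismatch_cost_bits(pairs):
    """Return (uniform_bits, nonuniform_bits) for coding mismatched bases.

    pairs: list of (reference_base, actual_base) tuples with the two
    bases different, i.e. one tuple per substitution in a repeat.
    """
    # Uniform assumption: each of the 3 other bases is equally likely.
    uniform = len(pairs) * math.log2(3)
    # Hypothetical non-uniform model: empirical P(actual | reference).
    counts = Counter(pairs)
    totals = Counter(ref for ref, _ in pairs)
    nonuniform = -sum(
        n * math.log2(n / totals[ref]) for (ref, _), n in counts.items()
    )
    return uniform, nonuniform


def build_kmer_index(seq, k):
    """Hash table mapping each k-mer to the sorted list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def predict_repeat_start(index, seq, pos, k):
    """Guess where a repeat of the text at `pos` began earlier in `seq`.

    Returns the most recent earlier position holding the same k-mer,
    or None if the k-mer has not been seen before `pos`.
    """
    kmer = seq[pos:pos + k]
    earlier = [i for i in index.get(kmer, ()) if i < pos]
    return earlier[-1] if earlier else None


# Made-up substitutions, skewed toward A->G, purely for illustration.
pairs = [("A", "G")] * 6 + [("A", "C")] + [("A", "T")]
uniform_bits, nonuniform_bits = mismatch_cost_bits(pairs)

# Toy sequence in which "ACGT" occurs at positions 0 and 6.
seq = "ACGTTTACGTAA"
index = build_kmer_index(seq, 4)
earlier_start = predict_repeat_start(index, seq, 6, 4)
```

When the substitutions are skewed, as in the made-up `pairs` above, the non-uniform cost falls below the uniform one. When they are spread evenly across the three alternatives, the two costs coincide, so the gain depends entirely on how skewed the real mismatch statistics are.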

Published in: 2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)

Date of Conference: 18 December 2010

Date Added to IEEE Xplore: 28 January 2011

DOI: 10.1109/BIBMW.2010.5703941

Conference Location: Hong Kong, China