Years and Authors of Summarized Original Work
2001; Landau, Schmidt, Sokol
2003; Kolpakov, Kucherov
Problem Definition
Identifying periodic structures in words (variants of which are known as tandem repeats, repetitions, powers, or runs) is a fundamental algorithmic task (see entry Squares and Repetitions). In many practical applications, such as DNA sequence analysis, the repetitions considered admit some variation between copies of the repeated pattern. In other words, the repetitions of interest are approximate tandem repeats rather than exact repeats only.
The simplest instance of an approximate tandem repeat is an approximate square. An approximate square in a word w is a subword uv, where u and v are within a given distance k according to some distance measure between words, such as Hamming distance or edit (also called Levenshtein) distance. There are several ways to define approximate tandem repeats as successions of approximate squares, i.e., to generalize to...
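The definition of an approximate square can be made concrete with a small brute-force sketch for the Hamming-distance case. This is not the algorithm of Landau, Schmidt, and Sokol nor that of Kolpakov and Kucherov; it is a naive quadratic-time enumeration, and the function names hamming and approximate_squares are hypothetical, introduced here only for illustration. It assumes |u| = |v|, which Hamming distance requires.

```python
def hamming(u, v):
    """Number of mismatching positions between two equal-length words."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of equal length")
    return sum(a != b for a, b in zip(u, v))


def approximate_squares(w, k):
    """Naive enumeration of approximate squares under Hamming distance.

    Returns all (start, period) pairs such that the subword
    w[start:start + 2*period] splits as uv with |u| = |v| = period
    and hamming(u, v) <= k. Runs in O(n^2) time for |w| = n.
    """
    n = len(w)
    found = []
    for period in range(1, n // 2 + 1):
        # Mismatches between u = w[0:period] and v = w[period:2*period].
        mismatches = sum(w[i] != w[i + period] for i in range(period))
        start = 0
        while True:
            if mismatches <= k:
                found.append((start, period))
            if start + 2 * period >= n:
                break  # the window cannot slide any further right
            # Slide the window one position: drop the leftmost compared
            # pair and add the new rightmost pair.
            mismatches -= w[start] != w[start + period]
            mismatches += w[start + period] != w[start + 2 * period]
            start += 1
    return found
```

For instance, in the word ACGTACGA the subword starting at position 0 with period 4 splits into u = ACGT and v = ACGA, which differ in one position, so it is reported for k = 1 but not for k = 0. Edit distance would need a different comparison, since u and v may then have different lengths.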
Recommended Reading
Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27:573–580
Boeva VA, Régnier M, Makeev VJ (2004) SWAN: searching for highly divergent tandem repeats in DNA sequences with the evaluation of their statistical significance. In: Proceedings of JOBIM 2004, Montreal, p 40
Butler JM (2001) Forensic DNA typing: biology and technology behind STR markers. Academic Press, San Diego
Crochemore M (1983) Recherche linéaire d’un carré dans un mot. C R Acad Sci Paris Sér I Math 296:781–784
Delgrange O, Rivals E (2004) STAR – an algorithm to search for tandem approximate repeats. Bioinformatics 20:2812–2820
Gelfand Y, Rodriguez A, Benson G (2007) TRDB – the tandem repeats database. Nucleic Acids Res 35(suppl. 1):D80–D87
Gusfield D (1997) Algorithms on strings, trees, and sequences. Cambridge University Press, Cambridge/New York
Kolpakov R, Kucherov G (1999) Finding maximal repetitions in a word in linear time. In: 40th symposium foundations of computer science (FOCS), New York, pp 596–604. IEEE Computer Society Press
Kolpakov R, Kucherov G (2003) Finding approximate repetitions under Hamming distance. Theor Comput Sci 303(1):135–156
Kolpakov R, Kucherov G (2005) Identification of periodic structures in words. In: Berstel J, Perrin D (eds) Applied combinatorics on words. Encyclopedia of mathematics and its applications. Lothaire books, vol 104, pp 430–477. Cambridge University Press, Cambridge
Kolpakov R, Bana G, Kucherov G (2003) mreps: efficient and flexible detection of tandem repeats in DNA. Nucleic Acids Res 31(13):3672–3678
Landau GM, Vishkin U (1988) Fast string matching with k differences. J Comput Syst Sci 37(1):63–78
Landau GM, Myers EW, Schmidt JP (1998) Incremental string comparison. SIAM J Comput 27(2):557–582
Landau GM, Schmidt JP, Sokol D (2001) An algorithm for approximate tandem repeats. J Comput Biol 8:1–18
Main M (1989) Detecting leftmost maximal periodicities. Discret Appl Math 25:145–153
Main M, Lorentz R (1984) An O(n log n) algorithm for finding all repetitions in a string. J Algorithms 5(3):422–432
Messer PW, Arndt PF (2007) The majority of recent short DNA insertions in the human genome are tandem duplications. Mol Biol Evol 24(5):1190–1197
Rodeh M, Pratt V, Even S (1981) Linear algorithm for data compression via string matching. J Assoc Comput Mach 28(1):16–24
Sokol D, Benson G, Tojeira J (2006) Tandem repeats over the edit distance. Bioinformatics 23(2):e30–e35
Wexler Y, Yakhini Z, Kashi Y, Geiger D (2005) Finding approximate tandem repeats in genomic sequences. J Comput Biol 12(7):928–942
Acknowledgements
This work was supported in part by the National Science Foundation Grant DB&I 0542751.
Copyright information
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Kucherov, G., Sokol, D. (2016). Approximate Tandem Repeats. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_24
DOI: https://doi.org/10.1007/978-1-4939-2864-4_24
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering