Abstract
A gapped repeat is a factor of the form uvu, where u and v are nonempty words. The period of the gapped repeat is defined as |u| + |v|. The gapped repeat is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its period. The gapped repeat is called α-gapped if its period is not greater than α|u|. A δ-subrepetition is a factor whose exponent is less than 2 but not less than 1 + δ (the exponent of a factor is the quotient of its length and its minimal period). The δ-subrepetition is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its minimal period. We show that in a word of length n the number of maximal α-gapped repeats is bounded by \(O(\alpha^2 n)\) and the number of maximal δ-subrepetitions is bounded by \(O(n/\delta^2)\). Using these upper bounds, we propose algorithms for finding all maximal α-gapped repeats and all maximal δ-subrepetitions in a word of length n. The algorithm for finding all maximal α-gapped repeats has \(O(\alpha^2 n)\) time complexity for the case of constant alphabet size and \(O(n\log n + \alpha^2 n)\) time complexity for the general case. For finding all maximal δ-subrepetitions we propose two algorithms. The first algorithm has \(O(\frac{n\log\log n}{\delta^2})\) time complexity for the case of constant alphabet size and \(O(n\log n + \frac{n\log\log n}{\delta^2})\) time complexity for the general case. The second algorithm has \(O(n\log n + \frac{n}{\delta^2}\log\frac{1}{\delta})\) expected time complexity.
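The definitions above can be made concrete with a small brute-force sketch. It is only an illustration of the definitions, not the algorithms proposed in the paper, and it runs in polynomial time far above the stated bounds. All function names are hypothetical, and it assumes that "cannot be extended" means that the neighbouring letters at distance equal to the period differ or that the word boundary is reached.

```python
from fractions import Fraction


def maximal_gapped_repeats(w, alpha):
    """Brute force: (start, arm, period) for each maximal alpha-gapped repeat uvu.

    start is the index of the left copy of u, arm = |u|, period = |u| + |v|.
    """
    n = len(w)
    found = []
    for p in range(2, n):              # period >= 2, since u and v are nonempty
        j = 0
        while j + p < n:
            if w[j] != w[j + p]:
                j += 1
                continue
            start = j
            # Grow the block of positions that match at distance p; the block
            # is bounded by a mismatch or a word boundary on both sides.
            while j + p < n and w[j] == w[j + p]:
                j += 1
            arm = j - start
            # arm >= p would leave no room for a nonempty gap v.
            if arm < p and p <= alpha * arm:
                found.append((start, arm, p))
    return found


def minimal_period(s):
    """Smallest p >= 1 with s[k] == s[k + p] for all valid k (s nonempty)."""
    n = len(s)
    for p in range(1, n + 1):
        if all(s[k] == s[k + p] for k in range(n - p)):
            return p


def exponent(s):
    """Length divided by minimal period, as an exact fraction."""
    return Fraction(len(s), minimal_period(s))


def maximal_subrepetitions(w, delta):
    """Brute force: (i, j, p) for each maximal delta-subrepetition w[i:j]."""
    n = len(w)
    found = []
    for i in range(n):
        for j in range(i + 1, n + 1):
            p = minimal_period(w[i:j])
            if not (1 + delta <= Fraction(j - i, p) < 2):
                continue
            # Maximal: a one-letter extension changes the minimal period.
            left_ok = i == 0 or minimal_period(w[i - 1:j]) != p
            right_ok = j == n or minimal_period(w[i:j + 1]) != p
            if left_ok and right_ok:
                found.append((i, j, p))
    return found
```

For the word `abcab`, with u = `ab` and v = `c`, the period is 3 and |u| = 2, so the repeat is α-gapped exactly when α ≥ 3/2. The same factor has exponent 5/3, so it is a δ-subrepetition for every δ ≤ 2/3.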
This work is partially supported by the Russian Foundation for Fundamental Research (Grant 12-07-00216).
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kolpakov, R., Podolskiy, M., Posypkin, M., Khrapov, N. (2014). Searching of Gapped Repeats and Subrepetitions in a Word. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_22
DOI: https://doi.org/10.1007/978-3-319-07566-2_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07565-5
Online ISBN: 978-3-319-07566-2
eBook Packages: Computer Science, Computer Science (R0)