Abstract
The Longest Common Extension (LCE) of a pair of positions (i, j) in a string, or word, is the longest substring that starts at both i and j. The LCE problem takes a word and a set of pairs of positions and computes, for each pair in the set, the longest common extension starting at both positions of the pair. This problem has applications in matching with don’t-care characters, approximate string searching, and finding all exact or approximate tandem repeats, to name a few. From a practical point of view, Ilie et al. (Journal of Discrete Algorithms, 2010) looked for simple and efficient algorithms for the LCE problem. In this paper, we extend their analyses to partial words, that is, strings with don’t-cares or holes. In this context, we compute the Longest Common Compatible Extension of each pair of positions (i, j) in a partial word, i.e., the longest substrings starting at i and j that are compatible. We show that our results match those for total words (partial words without holes). We find that one of the simplest algorithms for the LCE problem is optimal on average in this case.
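To make the definition concrete, the following is a minimal sketch of computing longest common compatible extensions by direct symbol-by-symbol comparison. It is an illustration under stated assumptions, not the implementation analysed in the paper: the hole symbol `?`, the 0-based positions, and the function names `compatible`, `lcce`, and `lcce_all` are all hypothetical choices made here. Two symbols are treated as compatible when they are equal or when at least one of them is a hole.

```python
HOLE = "?"  # assumed hole symbol; the abstract does not fix a notation


def compatible(a: str, b: str) -> bool:
    """Two symbols are compatible if they are equal or either one is a hole."""
    return a == b or a == HOLE or b == HOLE


def lcce(w: str, i: int, j: int) -> int:
    """Length of the longest compatible extension at 0-based positions i and j.

    Scans forward from both positions and stops at the first incompatible
    pair of symbols or at the end of the word.
    """
    n = len(w)
    k = 0
    while i + k < n and j + k < n and compatible(w[i + k], w[j + k]):
        k += 1
    return k


def lcce_all(w: str, pairs):
    """Answer a set of position pairs, one query at a time."""
    return [lcce(w, i, j) for i, j in pairs]


if __name__ == "__main__":
    # Partial word with one hole at position 1.
    print(lcce_all("a?cabc", [(0, 3), (1, 4), (0, 1)]))  # [3, 2, 2]
    # Total word (no holes): the result is the ordinary LCE.
    print(lcce("abcabd", 0, 3))  # 2
```

On a word without holes, `compatible` reduces to equality, so the same loop returns the ordinary longest common extension.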
References
Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)
Berkman, O., Vishkin, U.: Recursive star-tree parallel data structure. SIAM J. Comput. 22, 221–242 (1993)
Blanchet-Sadri, F., Lazarow, J.: Suffix trees for partial words and the longest common compatible prefix problem. In: Dediu, A.-H., Martín-Vide, C., Truthe, B. (eds.) LATA 2013. LNCS, vol. 7810, pp. 165–176. Springer, Heidelberg (2013)
Crochemore, M., Iliopoulos, C.S., Kociumaka, T., Kubica, M., Langiu, A., Radoszewski, J., Rytter, W., Szreder, B., Waleń, T.: A note on the longest common compatible prefix problem for partial words (2013). arXiv:1312.2381v1
Fischer, J., Heun, V.: Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 36–48. Springer, Heidelberg (2006)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press, Cambridge (1997)
Gusfield, D., Stoye, J.: Linear time algorithms for finding and representing all the tandem repeats in a string. J. Comput. Syst. Sci. 69, 525–546 (2004)
Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984)
Ilie, L., Navarro, G., Tinta, L.: The longest common extension problem revisited and applications to approximate string searching. J. Discrete Algorithms 8(4), 418–428 (2010)
Kärkkäinen, J., Sanders, P.: Simple linear work suffix array construction. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 943–955. Springer, Berlin (2003)
Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)
Landau, G., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)
Landau, G., Vishkin, U.: Introducing efficient parallelism into approximate string matching and a new serial algorithm. In: STOC 1986, pp. 220–230. ACM Press (1986)
Landau, G., Vishkin, U.: Fast parallel and serial approximate string matching. J. Algorithms 10, 157–169 (1989)
Main, M.G., Lorentz, R.J.: An O(n log n) algorithm for finding all repetitions in a string. J. Algorithms 5(3), 422–432 (1984)
de Castro Miranda, R., Ayala-Rincón, M.: A modification of the Landau-Vishkin algorithm computing longest common extensions via suffix arrays. In: Setubal, J.C., Verjovski-Almeida, S. (eds.) BSB 2005. LNCS (LNBI), vol. 3594, pp. 210–213. Springer, Heidelberg (2005)
Myers, G.: An O(nd) difference algorithm and its variations. Algorithmica 1, 251–266 (1986)
Schieber, B., Vishkin, U.: On finding lowest common ancestors: simplification and parallelization. SIAM J. Comput. 17, 1253–1262 (1988)
Weiner, P.: Linear pattern matching algorithms. In: SWAT 1973, pp. 1–11 (1973)
Acknowledgements
Project sponsored by the National Security Agency under Grant Number H98230-15-1-0232. The United States Government is authorized to reproduce and distribute reprints notwithstanding any copyright notation herein. This manuscript is submitted for publication with the understanding that the United States Government is authorized to reproduce and distribute reprints. This material is based upon work supported by the National Science Foundation under Grant No. DMS–1060775.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Blanchet-Sadri, F., Harred, R., Lazarow, J. (2016). Longest Common Extensions in Partial Words. In: Lipták, Z., Smyth, W. (eds) Combinatorial Algorithms. IWOCA 2015. Lecture Notes in Computer Science(), vol 9538. Springer, Cham. https://doi.org/10.1007/978-3-319-29516-9_5
DOI: https://doi.org/10.1007/978-3-319-29516-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29515-2
Online ISBN: 978-3-319-29516-9
eBook Packages: Computer Science (R0)