Abstract
The Longest Common Extension problem considers a string s and computes, for each of a number of pairs (i,j), the longest substring of s that starts at both i and j. It appears as a subproblem in many fundamental string problems and can be solved by linear-time preprocessing of the string that allows (worst-case) constant-time computation for each pair. The two known approaches use powerful algorithms: either constant-time computation of the Lowest Common Ancestor in trees or constant-time computation of Range Minimum Queries (RMQ) in arrays. We show here that, from practical point of view, such complicated approaches are not needed. We give two very simple algorithms for this problem that require no preprocessing. The first needs only the string and is significantly faster than all previous algorithms on the average. The second combines the first with a direct RMQ computation on the Longest Common Prefix array. It takes advantage of the superior speed of the cache memory and is the fastest on virtually all inputs.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Abouelhoda, M.I., Kurtz, S., Ohlenbusch, E.: Replacing suffix trees with enhanced suffix arrays. J. Discrete Algorithms 2, 53–86 (2004)
Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)
Berkman, O., Vishkin, U.: Recursive star-tree parallel data structure. SIAM J. Comput. 22, 221–242 (1993)
de Bruijn, N.G.: A combinatorial problem. Nederl. Akad. Wetensch. Proc. 49, 758–764 (1946)
Fischer, J., Heun, V.: Theoretical and Practical Improvements on the RMQ-Problem, with Applications to LCA and LCE. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 36–48. Springer, Heidelberg (2006)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Gusfield, D., Stoye, J.: Linear time algorithm for finding and representing all tandem repeats in a string. J. Comput. Syst. Sci. 69, 525–546 (2004)
Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13, 338–355 (1984)
Kärkkäinen, J., Sanders, P.: Simple linear work suffix array construction. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 943–955. Springer, Heidelberg (2003)
Kasai, T., Lee, G., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)
Kim, D.K., Sim, J.S., Park, H., Park, K.: Constructing suffix arrays in linear time. J. Discrete Algorithms 3(2-4), 126–142 (2005)
Ko, P., Aluru, S.: Space efficient linear time construction of suffix arrays. Discrete Algorithms 3(2-4), 143–156 (2005)
Landau, G., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)
Landau, G., Vishkin, U.: Introducing efficient parallelism into approximate string matching and a new serial algorithm. In: Proc. of STOC, pp. 220–230. ACM Press, New York (1986)
Main, M., Lorentz, R.J.: An O(n log n) algorithm for finding all repetitions in a string. J. Algorithms 5, 422–432 (1984)
Manber, U., Myers, G.: Suffix arrays: a new method for on-line search. SIAM J. Comput. 22(5), 935–948 (1993)
de Castro Miranda, R., Ayala-Rincón, M.: A Modification of the Landau-Vishkin Algorithm Computing Longest Common Extensions via Suffix Arrays. In: Setubal, J.C., Verjovski-Almeida, S. (eds.) BSB 2005. LNCS (LNBI), vol. 3594, pp. 210–213. Springer, Heidelberg (2005)
Myers, G.: An O(nd) difference algorithm and its variations. Algorithmica 1, 251–266 (1986)
Schieber, B., Vishkin, U.: On finding lowest common ancestors: Simplification and parallelization. SIAM J. Comput. 17, 1253–1262 (1988)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ilie, L., Tinta, L. (2009). Practical Algorithms for the Longest Common Extension Problem. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds) String Processing and Information Retrieval. SPIRE 2009. Lecture Notes in Computer Science, vol 5721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03784-9_30
Download citation
DOI: https://doi.org/10.1007/978-3-642-03784-9_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03783-2
Online ISBN: 978-3-642-03784-9
eBook Packages: Computer ScienceComputer Science (R0)