Longest Common Extensions in Partial Words

  • Conference paper
Combinatorial Algorithms (IWOCA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9538)

Abstract

The Longest Common Extension (LCE) of a pair of positions (i, j) in a string, or word, is the longest substring that starts at both i and j. The LCE problem takes a word and a set of pairs of positions and computes, for each pair in the set, the longest common extension starting at both positions in the pair. This problem finds applications in matching with don’t-care characters, approximate string searching, and finding all exact or approximate tandem repeats, to name a few. From a practical point of view, Ilie et al. (Journal of Discrete Algorithms, 2010) looked for simple and efficient algorithms for the LCE problem. In this paper, we extend their analyses to partial words, i.e., strings with don’t-cares or holes. In this context, we compute the Longest Common Compatible Extension of each pair of positions (i, j) in a partial word, i.e., the longest substrings starting at i and j that are compatible. We show that our results match those for total words (partial words without holes). We find that one of the simplest algorithms for the LCE problem is optimal on average in this case.
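
To make the definition concrete, the following is a minimal sketch of a longest common compatible extension query answered by direct character-by-character comparison. It is an illustration only, not the authors' implementation: the abstract does not say which algorithm it refers to, and the hole symbol `?`, the 0-based positions, and the names `compatible` and `lcce` are assumptions made here.

```python
HOLE = "?"  # assumed symbol for a hole (don't-care); not specified in the abstract


def compatible(a: str, b: str) -> bool:
    """Two symbols are compatible if they are equal or either one is a hole."""
    return a == b or a == HOLE or b == HOLE


def lcce(w: str, i: int, j: int) -> int:
    """Length of the longest common compatible extension of 0-based positions i and j in w.

    Scans forward from both positions and stops at the first incompatible
    pair of symbols or at the end of the word.
    """
    n = len(w)
    k = 0
    while i + k < n and j + k < n and compatible(w[i + k], w[j + k]):
        k += 1
    return k


if __name__ == "__main__":
    print(lcce("abcabd", 0, 3))  # total word: "ab" matches, then 'c' vs 'd' differ
    print(lcce("ab?abd", 0, 3))  # partial word: the hole is compatible with 'd'
```

On a total word (one without holes) `compatible` reduces to plain equality, so the same function answers ordinary LCE queries. Each query costs time proportional to the length of the extension it returns, with no preprocessing.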

References

  1. Bender, M.A., Farach-Colton, M.: The LCA problem revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)

  2. Berkman, O., Vishkin, U.: Recursive star-tree parallel data structure. SIAM J. Comput. 22, 221–242 (1993)

  3. Blanchet-Sadri, F., Lazarow, J.: Suffix trees for partial words and the longest common compatible prefix problem. In: Dediu, A.-H., Martín-Vide, C., Truthe, B. (eds.) LATA 2013. LNCS, vol. 7810, pp. 165–176. Springer, Heidelberg (2013)

  4. Crochemore, M., Iliopoulos, C.S., Kociumaka, T., Kubica, M., Langiu, A., Radoszewski, J., Rytter, W., Szreder, B., Waleń, T.: A note on the longest common compatible prefix problem for partial words (2013). arXiv:1312.2381v1

  5. Fischer, J., Heun, V.: Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 36–48. Springer, Heidelberg (2006)

  6. Gusfield, D.: Algorithms on Strings, Trees, and Sequences. Cambridge University Press, Cambridge (1997)

  7. Gusfield, D., Stoye, J.: Linear time algorithms for finding and representing all the tandem repeats in a string. J. Comput. Syst. Sci. 69, 525–546 (2004)

  8. Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13(2), 338–355 (1984)

  9. Ilie, L., Navarro, G., Tinta, L.: The longest common extension problem revisited and applications to approximate string searching. J. Discrete Algorithms 8(4), 418–428 (2010)

  10. Kärkkäinen, J., Sanders, P.: Simple linear work suffix array construction. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 943–955. Springer, Berlin (2003)

  11. Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)

  12. Landau, G., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8, 1–18 (2001)

  13. Landau, G., Vishkin, U.: Introducing efficient parallelism into approximate string matching and a new serial algorithm. In: STOC 1986, pp. 220–230. ACM Press (1986)

  14. Landau, G., Vishkin, U.: Fast parallel and serial approximate string matching. J. Algorithms 10, 157–169 (1989)

  15. Main, M.G., Lorentz, R.J.: An O(n log n) algorithm for finding all repetitions in a string. J. Algorithms 5(3), 422–432 (1984)

  16. de Castro Miranda, R., Ayala-Rincón, M.: A modification of the Landau-Vishkin algorithm computing longest common extensions via suffix arrays. In: Setubal, J.C., Verjovski-Almeida, S. (eds.) BSB 2005. LNCS (LNBI), vol. 3594, pp. 210–213. Springer, Heidelberg (2005)

  17. Myers, G.: An O(ND) difference algorithm and its variations. Algorithmica 1, 251–266 (1986)

  18. Schieber, B., Vishkin, U.: On finding lowest common ancestors: simplification and parallelization. SIAM J. Comput. 17, 1253–1262 (1988)

  19. Weiner, P.: Linear pattern matching algorithms. In: SWAT 1973, pp. 1–11 (1973)

Acknowledgements

Project sponsored by the National Security Agency under Grant Number H98230-15-1-0232. The United States Government is authorized to reproduce and distribute reprints notwithstanding any copyright notation herein. This manuscript is submitted for publication with the understanding that the United States Government is authorized to reproduce and distribute reprints. This material is based upon work supported by the National Science Foundation under Grant No. DMS–1060775.

Author information

Corresponding author

Correspondence to Francine Blanchet-Sadri.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Blanchet-Sadri, F., Harred, R., Lazarow, J. (2016). Longest Common Extensions in Partial Words. In: Lipták, Z., Smyth, W. (eds) Combinatorial Algorithms. IWOCA 2015. Lecture Notes in Computer Science, vol 9538. Springer, Cham. https://doi.org/10.1007/978-3-319-29516-9_5

  • DOI: https://doi.org/10.1007/978-3-319-29516-9_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29515-2

  • Online ISBN: 978-3-319-29516-9

  • eBook Packages: Computer Science, Computer Science (R0)
