Skip to main content

Tighter Bounds for the Sum of Irreducible LCP Values

  • Conference paper
  • First Online:
  • 867 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9133))

Abstract

The suffix array is frequently augmented with the longest-common-prefix (LCP) array that stores the lengths of the longest common prefixes between lexicographically adjacent suffixes of a text. While the sum of the values in the LCP array can be \(\Omega (n^2)\) for a text of length \(n\), the sum of so-called irreducible LCP values was shown to be \(\mathcal {O}(n\lg n)\) just a few years ago. In this paper, we improve the bound to \(\mathcal {O}(n\lg r)\), where \(r\le n\) is the number of runs in the Burrows-Wheeler transform of the text. We also show that our bound is tight up to lower order terms (unlike the previous bound). Our results and the techniques used in proving them provide new insights into the combinatorics of text indexing and compression, and have immediate applications to LCP array construction algorithms.

Partially supported by the project “Enhancing Educational Potential of Nicolaus Copernicus University” (project no. POKL.04.01.01-00-081/10).

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    Throughout the paper we use \(\lg \) as a shorthand for \(\log _2\).

  2. 2.

    We use the double brace notation \(\{\!\!\{{\cdot }\}\!\!\}\) to denote a multiset as opposed to a set.

References

  1. Abouelhoda, M.I., Kurtz, S., Ohlebusch, E.: Replacing suffix trees with enhanced suffix arrays. J. Discrete Algorithms 2(1), 53–86 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  2. Burrows, M., Wheeler, D.J.: A block sorting lossless data compression algorithm. Technical report 124, Digital Equipment Corporation, Palo Alto, California (1994)

    Google Scholar 

  3. Fine, N.J., Wilf, H.S.: Uniqueness theorems for periodic functions. Proc. Amer. Math. Soc. 16(1), 109–114 (1965)

    Article  MATH  MathSciNet  Google Scholar 

  4. Higgins, P.M.: Burrows-Wheeler transformations and de Bruijn words. Theor. Comput. Sci. 457, 128–136 (2012)

    Article  MATH  MathSciNet  Google Scholar 

  5. Kärkkäinen, J., Kempa, D.: LCP array construction in external memory. In: Gudmundsson, J., Katajainen, J. (eds.) SEA 2014. LNCS, vol. 8504, pp. 412–423. Springer, Heidelberg (2014)

    Google Scholar 

  6. Kärkkäinen, J., Manzini, G., Puglisi, S.J.: Permuted longest-common-prefix array. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009. LNCS, vol. 5577, pp. 181–192. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  7. Mäkinen, V., Navarro, G., Sirén, J., Välimäki, N.: Storage and retrieval of highly repetitive sequence collections. J. Comp. Biol. 17(3), 281–308 (2010)

    Article  Google Scholar 

  8. Manber, U., Myers, G.W.: Suffix arrays: a new method for on-line string searches. SIAM J. Comp. 22(5), 935–948 (1993)

    Article  MATH  MathSciNet  Google Scholar 

  9. Mantaci, S., Restivo, A., Rosone, G., Sciortino, M.: An extension of the Burrows-Wheeler transform. Theor. Comput. Sci. 387(3), 298–312 (2007)

    Article  MATH  MathSciNet  Google Scholar 

  10. Manzini, G.: Two space saving tricks for linear time LCP array computation. In: Hagerup, T., Katajainen, J. (eds.) SWAT 2004. LNCS, vol. 3111, pp. 372–383. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  11. Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Comput. Surv. 39(1), 1–61 (2007)

    Article  Google Scholar 

  12. Bioinformatics Algorithms: Sequence Analysis, Genome Rearrangements, and Phylogenetic Reconstruction. Oldenbusch Verlag, Bremen, Germany (2013)

    Google Scholar 

  13. Ruskey, F.: Combinatorial generation, working version (1j-CSC 425/520) (2003)

    Google Scholar 

  14. Sirén, J.: Sampled longest common prefix array. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 227–237. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Juha Kärkkäinen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kärkkäinen, J., Kempa, D., Piątkowski, M. (2015). Tighter Bounds for the Sum of Irreducible LCP Values. In: Cicalese, F., Porat, E., Vaccaro, U. (eds) Combinatorial Pattern Matching. CPM 2015. Lecture Notes in Computer Science(), vol 9133. Springer, Cham. https://doi.org/10.1007/978-3-319-19929-0_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-19929-0_27

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19928-3

  • Online ISBN: 978-3-319-19929-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics