Space-Economical Algorithms for Finding Maximal Unique Matches

  • Conference paper
Combinatorial Pattern Matching (CPM 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2373))

Abstract

We present space-economical algorithms for finding maximal unique matches (MUMs) between two strings, which are important in large-scale genome sequence alignment. Our algorithms require only O(n) bits (O(n / log n) words) of space, where n is the total length of the strings. We propose three algorithms for different inputs: the strings alone, their compressed suffix array, or their compressed suffix tree. Their time complexities are O(n log n), O(n log^ε n), and O(n), respectively, where ε is any constant between 0 and 1. We also give an algorithm that constructs the compressed suffix tree from the compressed suffix array in O(n log^ε n) time and O(n) bits of space.
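
To make the problem concrete, the following is a minimal Python sketch that reports MUMs with an ordinary (uncompressed) suffix array and LCP array. It is not one of the space-economical algorithms of the paper: it uses far more than O(n) bits and builds the suffix array naively. It assumes the usual definition of a MUM (a substring that occurs exactly once in each string and cannot be extended to the left or right), and the function name `find_mums` and the two separator characters are illustrative choices only.

```python
def find_mums(a: str, b: str) -> list[tuple[int, int, int]]:
    """Return (pos_in_a, pos_in_b, length) for every MUM of a and b.

    Illustrative only: naive suffix sorting, not space-efficient.
    """
    # Assumption: these two separator characters occur in neither input.
    sep, end = "\x01", "\x00"
    s = a + sep + b + end
    n = len(s)

    # Naive suffix array: sort suffix start positions by the suffix text.
    sa = sorted(range(n), key=lambda i: s[i:])

    def common_prefix(i: int, j: int) -> int:
        k = 0
        while i + k < n and j + k < n and s[i + k] == s[j + k]:
            k += 1
        return k

    # lcp[i] = longest common prefix of suffixes sa[i-1] and sa[i];
    # lcp[0] and lcp[n] stay 0 and act as boundary sentinels.
    lcp = [0] * (n + 1)
    for i in range(1, n):
        lcp[i] = common_prefix(sa[i - 1], sa[i])

    mums = []
    for i in range(1, n):
        length = lcp[i]
        if length == 0:
            continue
        p, q = sorted((sa[i - 1], sa[i]))
        # One suffix must start in a, the other in b.
        if not (p < len(a) < q):
            continue
        # Unique: no neighbouring suffix shares the same prefix.
        if lcp[i - 1] >= length or lcp[i + 1] >= length:
            continue
        # Left-maximal: the preceding characters differ (or a's match
        # starts at position 0). Right-maximality follows from the LCP.
        if p > 0 and s[p - 1] == s[q - 1]:
            continue
        mums.append((p, q - len(a) - 1, length))
    return sorted(mums)
```

For example, `find_mums("abcxyz", "xyzabc")` reports the two matches `abc` and `xyz`, while `find_mums("aa", "a")` reports nothing because `a` is not unique in the first string.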

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hon, WK., Sadakane, K. (2002). Space-Economical Algorithms for Finding Maximal Unique Matches. In: Apostolico, A., Takeda, M. (eds) Combinatorial Pattern Matching. CPM 2002. Lecture Notes in Computer Science, vol 2373. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45452-7_13

  • DOI: https://doi.org/10.1007/3-540-45452-7_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43862-5

  • Online ISBN: 978-3-540-45452-6

  • eBook Packages: Springer Book Archive
