Computing Similarity of Run-Length Encoded Strings with Affine Gap Penalty

  • Conference paper
String Processing and Information Retrieval (SPIRE 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3772)

Abstract

The problem of computing the similarity of two run-length encoded strings has been studied for various scoring metrics. Many algorithms have been developed for the longest common subsequence metric, and some for the Levenshtein distance metric and the weighted edit distance metric. In this paper we consider similarity based on the affine gap penalty metric, which is more general and considerably more complicated than the weighted edit distance. To compute similarity in this model efficiently, we convert the problem to a path problem on a directed acyclic graph and use some properties of maximum paths in this graph. We present an O(nm′ + n′m) time algorithm for computing the similarity of two run-length encoded strings in the affine gap penalty model, where n′ and m′ are the lengths of the two given run-length encoded strings, and n and m are the decoded lengths of the two strings, respectively.
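
To make the setting concrete, the following is a minimal sketch of the baseline that the run-length approach aims to improve on: a classical quadratic-time dynamic program for global similarity with affine gaps, in the style of Gotoh, run on the decoded strings. It is not the algorithm of this paper. The function names, the match/mismatch scores, and the convention that a gap of length k costs `gap_open + k * gap_extend` are all assumptions chosen for illustration.

```python
from itertools import groupby

NEG_INF = float("-inf")


def rle_encode(s):
    """Run-length encode s as a list of (character, run_length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(s)]


def rle_decode(runs):
    """Inverse of rle_encode."""
    return "".join(ch * k for ch, k in runs)


def affine_similarity(a, b, match=1, mismatch=-1, gap_open=2, gap_extend=1):
    """Global similarity of a and b; a gap of length k costs gap_open + k * gap_extend.

    Baseline O(nm) dynamic program on decoded strings (illustrative scores).
    """
    n, m = len(a), len(b)
    # H[i][j]: best score aligning a[:i] with b[:j]
    # E[i][j]: best score where b[j-1] is aligned to a gap (horizontal move)
    # F[i][j]: best score where a[i-1] is aligned to a gap (vertical move)
    H = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    E = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    F = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    H[0][0] = 0
    for j in range(1, m + 1):
        E[0][j] = -(gap_open + j * gap_extend)
        H[0][j] = E[0][j]
    for i in range(1, n + 1):
        F[i][0] = -(gap_open + i * gap_extend)
        H[i][0] = F[i][0]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_extend,
                          H[i][j - 1] - gap_open - gap_extend)
            F[i][j] = max(F[i - 1][j] - gap_extend,
                          H[i - 1][j] - gap_open - gap_extend)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(diag, E[i][j], F[i][j])
    return H[n][m]


if __name__ == "__main__":
    runs = rle_encode("aaaabbb")          # n = 7 decoded, n' = 2 runs
    print(runs, affine_similarity(rle_decode(runs), "aabbb"))
```

This baseline fills all (n+1)(m+1) cells of the table. The bound stated in the abstract, O(nm′ + n′m), is smaller whenever the strings contain long runs, since n′ ≤ n and m′ ≤ m.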

This work was supported by FPR05A2-341 of 21C Frontier Functional Proteomics Project from Korean Ministry of Science & Technology.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, J.W., Amir, A., Landau, G.M., Park, K. (2005). Computing Similarity of Run-Length Encoded Strings with Affine Gap Penalty. In: Consens, M., Navarro, G. (eds) String Processing and Information Retrieval. SPIRE 2005. Lecture Notes in Computer Science, vol 3772. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11575832_35

  • DOI: https://doi.org/10.1007/11575832_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29740-6

  • Online ISBN: 978-3-540-32241-2

  • eBook Packages: Computer Science, Computer Science (R0)
