Sparse Normalized Local Alignment

Efraty, Nadav; Landau, Gad M.

doi:10.1007/978-3-540-27801-6_25

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3109)

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching

Abstract

Given two strings, X and Y, both of length O(n) over an alphabet Σ, a basic problem (local alignment) is to find pairs of similar substrings, one from X and one from Y. For substrings X′ and Y′ of X and Y, respectively, we measure their similarity by the normalized alignment value LCS(X′,Y′)/(|X′|+|Y′|). Given an integer M, we consider only those substring pairs whose LCS length is at least M. We present an algorithm that reports the pairs of substrings with the highest normalized alignment value in O(n log|Σ| + rM log log n) time, where r is the number of matches between X and Y. We also present an O(n log|Σ| + rL log log n) algorithm, where L = LCS(X,Y), that reports all substring pairs with a normalized alignment value above a given threshold.
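
To make the definitions above concrete, the following is a minimal brute-force sketch in Python. It is not the sparse algorithm of the paper: it enumerates every substring pair and runs the textbook quadratic LCS dynamic program on each, so it is only usable on very short strings. The function names (`lcs_length`, `count_matches`, `best_normalized_pairs`) and the half-open index ranges used to report substrings are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def lcs_length(a: str, b: str) -> int:
    """Textbook O(|a|*|b|) dynamic program for the LCS length."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def count_matches(x: str, y: str) -> int:
    """r: the number of index pairs (i, j) with x[i] == y[j]."""
    y_counts = Counter(y)
    return sum(y_counts[ch] for ch in x)


def best_normalized_pairs(x: str, y: str, m: int):
    """Brute-force reference (illustration only, not the paper's method).

    Returns (best_value, pairs), where best_value is the highest
    LCS(X', Y') / (|X'| + |Y'|) over all substring pairs whose LCS
    length is at least m, and pairs lists the half-open index ranges
    ((i, i2), (j, j2)) that attain it. Returns (None, []) if no
    substring pair reaches LCS length m.
    """
    best = None
    pairs = []
    for i in range(len(x)):
        for i2 in range(i + 1, len(x) + 1):
            for j in range(len(y)):
                for j2 in range(j + 1, len(y) + 1):
                    lcs = lcs_length(x[i:i2], y[j:j2])
                    if lcs < m:
                        continue
                    # Exact rational arithmetic avoids float ties.
                    value = Fraction(lcs, (i2 - i) + (j2 - j))
                    if best is None or value > best:
                        best, pairs = value, [((i, i2), (j, j2))]
                    elif value == best:
                        pairs.append(((i, i2), (j, j2)))
    return best, pairs


if __name__ == "__main__":
    x, y = "abcde", "abde"
    print("r =", count_matches(x, y))
    print("M = 2:", best_normalized_pairs(x, y, 2))
    print("M = 3:", best_normalized_pairs(x, y, 3))
```

Because LCS(X′,Y′) can be at most min(|X′|, |Y′|), the normalized value never exceeds 1/2, and it reaches 1/2 exactly when X′ and Y′ are identical. Raising M in the sketch excludes short identical substrings and forces longer, possibly gapped, pairs to be reported.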

Author information

Authors and Affiliations

Department of Computer Science, Haifa University, Haifa, 31905, Israel
Nadav Efraty & Gad M. Landau
Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, NY, 11201-3840, USA
Gad M. Landau

Editor information

Editors and Affiliations

School of Computing Science, Simon Fraser University, 8888 University Drive, V5A 1S6, Burnaby, BC, Canada
Suleyman Cenk Sahinalp
Google Inc., 76 9th Av, 4th Fl., 10011, New York, NY
S. Muthukrishnan
Tom Sawyer Software, 94612, Oakland, CA, USA
Ugur Dogrusoz

About this paper

Cite this paper

Efraty, N., Landau, G.M. (2004). Sparse Normalized Local Alignment. In: Sahinalp, S.C., Muthukrishnan, S., Dogrusoz, U. (eds) Combinatorial Pattern Matching. CPM 2004. Lecture Notes in Computer Science, vol 3109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27801-6_25

DOI: https://doi.org/10.1007/978-3-540-27801-6_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22341-2
Online ISBN: 978-3-540-27801-6
eBook Packages: Springer Book Archive