Abstract
Given a sequence A of length M and a regular expression R of length P, an approximate regular expression pattern matching algorithm computes the score of the best alignment between A and one of the sequences exactly matched by R. There are a variety of schemes for scoring alignments. In a concave gap-penalty scoring scheme, a function δ(a, b) gives the score of each aligned pair of symbols a and b, and a concave function w(k) gives the score of a sequence of unaligned symbols, or gap, of length k. A function w is concave if and only if, for all k > 1, w(k+1) − w(k) < w(k) − w(k−1). In this paper we present an O(MP(log M + log² P)) algorithm for approximate regular expression matching for an arbitrary δ and any concave w.
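The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only checks the stated concavity condition on a candidate gap function over a finite range and scores one fixed alignment under a given δ and w. The names `is_concave`, `score_alignment`, `log_gap`, and `mismatch_cost` are hypothetical, the example gap function and pair scores are invented for illustration, and the choice to treat scores as costs is an assumption.

```python
import math
from itertools import groupby
from typing import Callable, List, Optional, Tuple

# One alignment column: (symbol of A or None, symbol of the matched sequence or None).
Column = Tuple[Optional[str], Optional[str]]


def is_concave(w: Callable[[int], float], max_k: int, strict: bool = True) -> bool:
    """Check w(k+1) - w(k) < w(k) - w(k-1) for every k with 1 < k < max_k.

    With strict=False the comparison is relaxed to <=, which also admits
    affine gap functions (an assumption beyond the abstract's definition).
    """
    for k in range(2, max_k):
        forward = w(k + 1) - w(k)
        backward = w(k) - w(k - 1)
        if forward > backward or (strict and forward == backward):
            return False
    return True


def score_alignment(
    columns: List[Column],
    delta: Callable[[str, str], float],
    w: Callable[[int], float],
) -> float:
    """Sum delta over aligned pairs and w(k) over each maximal gap of length k."""

    def kind(col: Column) -> str:
        a, b = col
        if a is None and b is None:
            raise ValueError("a column must contain at least one symbol")
        if a is not None and b is not None:
            return "pair"
        return "gap_a_unaligned" if b is None else "gap_b_unaligned"

    total = 0.0
    # groupby collects consecutive columns of the same kind, so each run of
    # unaligned symbols on one side is charged once as a single gap.
    for run_kind, run in groupby(columns, key=kind):
        run_columns = list(run)
        if run_kind == "pair":
            total += sum(delta(a, b) for a, b in run_columns)
        else:
            total += w(len(run_columns))
    return total


def log_gap(k: int) -> float:
    """Hypothetical concave gap function: an opening charge plus a log term."""
    return 4.0 + 3.0 * math.log(k)


def mismatch_cost(a: str, b: str) -> float:
    """Hypothetical delta: 0 for identical symbols, 1 otherwise."""
    return 0.0 if a == b else 1.0
```

For instance, aligning `ACGTT` against `AT` with the three middle symbols unaligned yields two matched pairs and one gap of length 3, so `score_alignment` returns `log_gap(3)` under these example functions. A linear function such as `w(k) = 2 + k` fails the strict check but passes with `strict=False`.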
This work was partially supported by the National Institutes of Health under Grant R01 LM04960 and by the Aspen Center for Physics.
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Knight, J.R., Myers, E.W. (1992). Approximate regular expression pattern matching with concave gap penalties. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_6
DOI: https://doi.org/10.1007/3-540-56024-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56024-1
Online ISBN: 978-3-540-47357-2
eBook Packages: Springer Book Archive