Identifying periodic occurrences of a template with applications to protein structure

  • Conference paper
Combinatorial Pattern Matching (CPM 1992)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 644)

Abstract

We consider a string matching problem where the pattern is a template that matches many different strings with various degrees of perfection. The quality of a match is given by a penalty matrix that assigns each pair of characters a score that characterizes how well the characters match. Superfluous characters in the text and superfluous characters in the pattern may also occur, and the respective penalties for such gaps in the alignment are also given by the penalty matrix. For a text T of length n and a template P of length m, we wish to find the best alignment of T with P^n, the concatenation of n copies of P (m will typically be much smaller than n). Such an alignment can be obtained simply by solving a dynamic programming problem of size O(n^2 m), ignoring the periodic character of P^n. We show that the structure of P^n can be exploited and the problem reduced to essentially solving a dynamic programming problem of size O(mn). If the complexity of computing gap penalties is O(1) (which is frequently the case), our algorithm runs in O(mn) time. The problem was motivated by a protein structure problem.
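
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the periodicity of P^n can be exploited: a "wraparound" dynamic program that keeps a single row of m states (positions within one copy of P) instead of nm columns. The function name `periodic_alignment_cost`, the substitution function `sub`, and the single constant `gap` penalty are assumptions made for illustration; the paper's penalty matrix is more general. The sketch also assumes the template may be repeated as many whole times as is cheapest, rather than exactly n times.

```python
from typing import Callable, List


def periodic_alignment_cost(
    text: str,
    template: str,
    sub: Callable[[str, str], float] = lambda a, b: 0.0 if a == b else 1.0,
    gap: float = 1.0,
) -> float:
    """Cheapest alignment of `text` with some number of whole copies of `template`.

    State j (0 <= j < m) means "j characters of the current copy consumed";
    state 0 doubles as the boundary between two consecutive copies.
    Runs in O(mn) time when `sub` is O(1).
    """
    m = len(template)
    if m == 0:
        raise ValueError("template must be non-empty")
    if gap <= 0:
        raise ValueError("gap penalty must be positive")
    inf = float("inf")

    def close_row(row: List[float]) -> List[float]:
        # Skipping template[j - 1] moves from state j - 1 to state j within
        # the same row, and state m - 1 wraps around to state 0. Because the
        # gap penalty is positive, an optimal chain of skips is shorter than
        # m, so it wraps at most once and two sweeps are enough.
        for _ in range(2):
            for j in range(m):
                cand = row[j - 1] + gap  # index -1 wraps to state m - 1
                if cand < row[j]:
                    row[j] = cand
        return row

    # Empty text: zero copies cost nothing; other states need skipped characters.
    prev = close_row([0.0] + [inf] * (m - 1))
    for ch in text:
        cur = [
            min(
                prev[j] + gap,                            # superfluous text character
                prev[j - 1] + sub(ch, template[j - 1]),   # match or mismatch
            )
            for j in range(m)
        ]
        prev = close_row(cur)
    return prev[0]  # the alignment must end on a copy boundary
```

With the default unit penalties, `periodic_alignment_cost("abcabxabc", "abc")` charges one mismatch for the `x`. A real application would pass a `sub` function backed by the penalty matrix and would likely need position-dependent gap penalties, which this sketch does not model.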

Partially supported by NSF grant CCR-8908286

Partially supported by NSF grant CCR-9110255 and the New York State Science and Technology Foundation Center for Advanced Technology


Editor information

Alberto Apostolico, Maxime Crochemore, Zvi Galil, Udi Manber

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fischetti, V.A., Landau, G.M., Schmidt, J.P., Sellers, P.H. (1992). Identifying periodic occurrences of a template with applications to protein structure. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_9

  • DOI: https://doi.org/10.1007/3-540-56024-6_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56024-1

  • Online ISBN: 978-3-540-47357-2

  • eBook Packages: Springer Book Archive
