Abstract
The primary structure of a ribonucleic acid (RNA) molecule is a sequence of nucleotides (bases) over the alphabet {A, C, G, U}. The secondary or tertiary structure of an RNA is a set of base pairs (nucleotide pairs) that form bonds between A and U and between C and G. For secondary structures, these bonds have traditionally been assumed to be one-to-one and non-crossing.
This paper considers a notion of similarity between two RNA molecule structures that takes into account the primary, secondary, and tertiary structures. We show that in general this problem is NP-hard for tertiary structures. We present algorithms for the case where at least one of the RNA structures involved is a secondary structure. We then show that these algorithms can be used in practical applications. We also present an approximation algorithm.
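To make the structural constraints named in the abstract concrete, the following is a minimal sketch of a validity check for a secondary structure: every pair must be A-U or C-G, each base may take part in at most one pair (one-to-one), and no two pairs may cross. It is an illustration only and not the similarity algorithm of the paper. The function name `is_secondary_structure`, the 0-based index convention, and the representation of a structure as a list of index pairs are assumptions made for this example.

```python
from typing import Iterable, Tuple

# Pairing rule as stated in the abstract: only A-U and C-G bonds.
ALLOWED = {frozenset("AU"), frozenset("CG")}


def is_secondary_structure(seq: str, pairs: Iterable[Tuple[int, int]]) -> bool:
    """Return True if `pairs` (0-based index pairs into `seq`) are
    complementary, one-to-one, and non-crossing.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Normalise each pair to (left, right) and sort by left endpoint.
    norm = sorted((min(i, j), max(i, j)) for i, j in pairs)

    used = set()
    for i, j in norm:
        # Indices must be distinct and inside the sequence.
        if i == j or i < 0 or j >= len(seq):
            return False
        # Bases must be complementary (A-U or C-G).
        if frozenset((seq[i], seq[j])) not in ALLOWED:
            return False
        # One-to-one: a base may appear in at most one pair.
        if i in used or j in used:
            return False
        used.update((i, j))

    # Non-crossing: there are no two pairs (i, j), (k, l) with i < k < j < l.
    # The stack holds right endpoints of the pairs that enclose the current one.
    stack = []
    for i, j in norm:
        while stack and stack[-1] < i:
            stack.pop()  # that pair closed before position i
        if stack and stack[-1] < j:
            return False  # current pair starts inside and ends outside
        stack.append(j)
    return True
```

For example, on the sequence `ACGU` the nested pairs `(0, 3)` and `(1, 2)` pass the check, whereas on `ACUG` the pairs `(0, 2)` and `(1, 3)` are complementary but cross, so the check fails. A structure that fails only the non-crossing test corresponds to what the abstract treats as a tertiary structure.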
Research supported partially by the Natural Sciences and Engineering Research Council of Canada under Grant No. OGP0046373.
© 1999 Springer-Verlag Berlin Heidelberg
Zhang, K., Wang, L., Ma, B. (1999). Computing Similarity between RNA Structures. In: Crochemore, M., Paterson, M. (eds) Combinatorial Pattern Matching. CPM 1999. Lecture Notes in Computer Science, vol 1645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48452-3_21
DOI: https://doi.org/10.1007/3-540-48452-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66278-5
Online ISBN: 978-3-540-48452-3