Abstract
In the context of non-coding RNA (ncRNA) multiple structural alignment, Davydov and Batzoglou introduced in [7] the problem of finding the largest nested linear graph that occurs in a set \({\mathcal{G}}\) of linear graphs, the so-called Max-NLS problem. This problem generalizes both the longest common subsequence problem and the maximum common homeomorphic subtree problem for rooted ordered trees.
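To make the notion of a nested linear graph concrete, here is a minimal sketch. It assumes (this is our reading, not a definition quoted from the abstract) that a linear graph is a set of arcs over linearly ordered vertices with pairwise distinct endpoints, and that it is nested when no two arcs cross. The names `crosses` and `is_nested` are hypothetical.

```python
from itertools import combinations

def crosses(a, b):
    """True if arcs a and b cross, i.e. i < k < j < l once each arc
    is oriented left-to-right and the two arcs are ordered by left endpoint."""
    (i, j), (k, l) = sorted((tuple(sorted(a)), tuple(sorted(b))))
    return i < k < j < l

def is_nested(arcs):
    """Assumed definition: a linear graph is nested if no two arcs cross."""
    return not any(crosses(a, b) for a, b in combinations(arcs, 2))
```

Under these assumptions, `is_nested([(1, 4), (2, 3), (5, 6)])` holds, while `is_nested([(1, 3), (2, 4)])` does not, because the two arcs interleave.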
In the present paper, we give a fast algorithm for finding the largest nested linear subgraph of a linear graph and a polynomial-time algorithm for a fixed number k of linear graphs. We also strengthen the result of [7] by proving that the problem is NP-complete even if \({\mathcal{G}}\) is composed of nested linear graphs of height at most 2, thereby precisely defining the borderline between tractable and intractable instances of the problem. Of particular importance, we improve the result of [7] by showing that the Max-NLS problem is approximable within ratio \({{\mathcal O}}(\log m_{opt})\) in \({{\mathcal O}}(kn^2)\) running time, where \(m_{opt}\) is the size of an optimal solution. We also present an \({{\mathcal O}}(1)\)-approximation of the Max-NLS problem running in \({{\mathcal O}}(kn)\) time for restricted linear graphs. In particular, for ncRNA-derived linear graphs, a \(\frac{1}{4}\)-approximation is presented.
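For the single-graph case, the following is a minimal sketch of one standard way to compute the size of a largest nested subgraph: an interval dynamic program in the spirit of classical loop-matching recurrences. It is not claimed to be the algorithm of the paper, whose details and running time are not given in the abstract. It assumes vertices are numbered 1..n, each vertex is an endpoint of at most one arc, and "nested" means that no two arcs cross. The function name `max_nested_subgraph_size` is hypothetical.

```python
def max_nested_subgraph_size(n, arcs):
    """Size of a largest set of pairwise non-crossing arcs.

    Assumptions (not taken from the paper): vertices are 1..n and each
    vertex is an endpoint of at most one arc.
    """
    # Map each left endpoint to its right endpoint.
    partner = {}
    for i, j in arcs:
        partner[min(i, j)] = max(i, j)

    # best[i][j] = max number of non-crossing arcs with both endpoints
    # in i..j; entries with i > j stay 0 (empty interval).
    best = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n, 0, -1):
        for j in range(i + 1, n + 1):
            value = best[i + 1][j]  # option 1: use no arc starting at i
            q = partner.get(i)
            if q is not None and q <= j:
                # option 2: keep arc (i, q); the rest splits into the part
                # strictly inside the arc and the part to its right
                value = max(value, 1 + best[i + 1][q - 1] + best[q + 1][j])
            best[i][j] = value
    return best[1][n]
```

Because each vertex has at most one arc under the stated assumption, every table entry is filled in constant time, so this sketch runs in \({{\mathcal O}}(n^2)\) time and space.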
This research was partially supported by the Polish Scientific Research Committee (KBN) under grant GR-1946 and by the French-Italian Galileo Project PAI 08484VH.
References
Abouelhoda, M.I., Ohlebusch, E.: Multiple Genome Alignment: Chaining Algorithms Revisited. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 1–16. Springer, Heidelberg (2003)
Bafna, V., Muthukrishnan, S., Ravi, R.: Computing similarity between RNA strings. In: CPM 1995. LNCS, vol. 937, pp. 1–16. Springer, Heidelberg (1995)
Bafna, V., Tang, H., Zhang, S.: Consensus Folding of Unaligned RNA Sequences Revisited. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds.) RECOMB 2005. LNCS (LNBI), vol. 3500, pp. 172–187. Springer, Heidelberg (2005)
Bereg, S., Zhu, B.: RNA multiple structural alignment with longest common subsequences. In: Wang, L. (ed.) COCOON 2005. LNCS, vol. 3595, pp. 32–41. Springer, Heidelberg (2005)
Bodlaender, H.L., Kloks, T., Kratsch, D., Müller, H.: Treewidth and minimum fill-in on d-trapezoid graphs. Journal of Graph Algorithms and Applications 2(5), 1–23 (1998)
Dagan, I., Golumbic, M.C., Pinter, R.Y.: Trapezoid graphs and their coloring. Discrete Applied Mathematics 21, 35–46 (1988)
Davydov, E., Batzoglou, S.: A Computational Model for RNA Multiple Structural Alignment. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 254–269. Springer, Heidelberg (2004)
Felsner, S., Müller, R., Wernisch, L.: Trapezoid graphs and generalizations: Geometry and algorithms. Discrete Applied Mathematics 74, 13–32 (1997)
Flotow, C.: On powers of m-trapezoid graphs. Discrete Applied Mathematics 63(2), 187–192 (1995)
Golumbic, M.C.: Algorithmic Graph Theory and Perfect Graphs. Academic Press, New York (1980)
Gramm, J., Guo, J., Niedermeier, R.: Pattern Matching for arc-annotated sequences. In: Agrawal, M., Seth, A.K. (eds.) FSTTCS 2002. LNCS, vol. 2556, pp. 182–193. Springer, Heidelberg (2002)
Holmes, I., Rubin, G.M.: Pairwise RNA structure comparison with stochastic context-free grammars. In: Pacific Symposium on Biocomputing, pp. 163–174 (2002)
Lin, G., Chen, Z.-Z., Jiang, T., Wen, J.: The longest common subsequence problem for sequences with nested arc annotations. Journal of Computer and System Sciences 65(3), 465–480 (2002) (Special issue on computational biology)
Liu, J., Wang, J.T., Hu, J., Tian, B.: A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics 6(89) (2005)
Lozano, A., Valiente, G.: On the maximum common embedded subtree problem for ordered trees. In: Iliopoulos, C., Lecroq, T. (eds.) String Algorithmics, ch. 7. King’s College London Publications (2004)
Nussinov, R., Pieczenik, G., Griggs, J.R., Kleitman, D.J.: Algorithms for loop matchings. SIAM Journal on Applied Mathematics 35(1), 68–82 (1978)
Vialette, S.: On the computational complexity of 2-interval pattern matching. Theoretical Computer Science 312(2-3), 223–249 (2004)
Waterman, M.S.: Introduction to computational biology - Maps, sequences and genomes. Chapman and Hall, London (1995)
Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing 18(6), 1245–1262 (1989)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kubica, M., Rizzi, R., Vialette, S., Waleń, T. (2006). Approximation of RNA Multiple Structural Alignment. In: Lewenstein, M., Valiente, G. (eds) Combinatorial Pattern Matching. CPM 2006. Lecture Notes in Computer Science, vol 4009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780441_20
DOI: https://doi.org/10.1007/11780441_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35455-0
Online ISBN: 978-3-540-35461-1
eBook Packages: Computer Science (R0)