Exact Pattern Matching for RNA Structure Ensembles

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2012)

Abstract

ExpaRNA’s core algorithm computes, for two fixed RNA structures, a maximal non-overlapping set of maximal exact matchings. We introduce an algorithm ExpaRNA-P that solves the lifted problem of finding such sets of exact matchings in entire Boltzmann-distributed structure ensembles of two RNAs. Due to a novel kind of structural sparsification, the new algorithm maintains the time and space complexity of the algorithm for fixed input structures. Furthermore, we generalize the chaining algorithm of ExpaRNA in order to compute a compatible subset of ExpaRNA-P’s exact matchings. We show that ExpaRNA-P outperforms ExpaRNA in BRAliBase 2.1 benchmarks, where we pass the chained exact matchings as anchor constraints to the RNA alignment tool LocARNA. Compared to LocARNA, this novel approach shows similar accuracy but is six times faster.
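The idea of selecting a compatible subset of exact matchings can be illustrated with a deliberately simplified sketch. It is not the ExpaRNA or ExpaRNA-P chaining algorithm: it assumes each exact matching is a plain ungapped block of positions in two sequences and ignores base pairs and structure entirely. Two blocks count as compatible when one ends before the other starts in both sequences, and the score of a chain is simply the number of matched positions. The names `Match`, `precedes`, and `chain_matches` are hypothetical and chosen only for this illustration.

```python
from typing import List, NamedTuple, Tuple


class Match(NamedTuple):
    """Hypothetical ungapped exact match between sequences A and B."""
    start_a: int  # first matched position in A (0-based)
    start_b: int  # first matched position in B (0-based)
    length: int   # number of matched positions (assumed >= 1)


def precedes(m: Match, n: Match) -> bool:
    """True if m ends before n starts in both sequences (no overlap, no crossing)."""
    return (m.start_a + m.length <= n.start_a
            and m.start_b + m.length <= n.start_b)


def chain_matches(matches: List[Match]) -> Tuple[int, List[Match]]:
    """Return a maximum-weight chain of pairwise compatible matches.

    Simplifying assumption: the weight of a match is its length.
    Quadratic-time dynamic programming over matches sorted by start_a.
    """
    if not matches:
        return 0, []
    # If m precedes n then m.start_a < n.start_a, so sorting by start_a
    # guarantees every possible predecessor is processed first.
    ordered = sorted(matches, key=lambda m: (m.start_a, m.start_b))
    best = [m.length for m in ordered]  # best chain weight ending at each match
    prev = [-1] * len(ordered)          # back-pointers for traceback
    for j, n in enumerate(ordered):
        for i in range(j):
            if precedes(ordered[i], n) and best[i] + n.length > best[j]:
                best[j] = best[i] + n.length
                prev[j] = i
    end = max(range(len(ordered)), key=best.__getitem__)
    total = best[end]
    selected: List[Match] = []
    while end != -1:
        selected.append(ordered[end])
        end = prev[end]
    selected.reverse()
    return total, selected


candidates = [
    Match(0, 0, 4),
    Match(2, 6, 5),
    Match(5, 5, 3),
    Match(9, 9, 2),
    Match(8, 2, 6),
]
total, selected = chain_matches(candidates)
```

In a pipeline like the one the abstract describes, the selected chain would play the role of the anchor constraints handed to the alignment tool; how those constraints are encoded for LocARNA is not covered by this sketch.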




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schmiedl, C. et al. (2012). Exact Pattern Matching for RNA Structure Ensembles. In: Chor, B. (ed.) Research in Computational Molecular Biology. RECOMB 2012. Lecture Notes in Computer Science, vol 7262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29627-7_27

  • DOI: https://doi.org/10.1007/978-3-642-29627-7_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29626-0

  • Online ISBN: 978-3-642-29627-7

  • eBook Packages: Computer Science (R0)
