Accuracy of RNA Structure Prediction Depends on the Pseudoknot Grammar

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2022)

Abstract

Most models for pseudoknotted RNA structures can be described by multiple context-free grammars (MCFGs) and are thus amenable to dynamic programming algorithms. They differ strongly in their definition of admissible structures and thus in the search space over which structures are optimized. The accuracy of structure prediction can be expected to depend on the choice of the MCFG: models that are too inclusive likely over-predict pseudoknots, while restrictive models by definition exclude more complex pseudoknotted structures. A systematic analysis of the impact of the grammar, however, is difficult since available implementations use incomparable energy parameters. We show here that Algebraic Dynamic Programming over MCFGs naturally disentangles the energy model (as specified by the evaluation algebra) from the search space (as defined by an MCFG). Preliminary computational experiments indicate that the choice of grammar already has an important impact for short RNA sequences.
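The separation of search space and scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use an MCFG: for brevity it assumes an ordinary context-free grammar for nested structures, S -> ε | . S | ( S ) S, no minimum hairpin length, and toy algebras in place of an energy model. All names (`Algebra`, `fold`, `COUNT`, `MAX_PAIRS`) are hypothetical. The point is only that the same grammar recursion can be evaluated under different algebras without modification.

```python
from collections import namedtuple
from functools import lru_cache

# Hypothetical evaluation-algebra interface: one function per grammar rule
# plus a choice function that combines the alternatives.
Algebra = namedtuple("Algebra", "nil unpaired pair choice")

# Toy algebras (assumptions for illustration, not an energy model).
COUNT = Algebra(nil=lambda: 1, unpaired=lambda x: x,
                pair=lambda x, y: x * y, choice=sum)
MAX_PAIRS = Algebra(nil=lambda: 0, unpaired=lambda x: x,
                    pair=lambda x, y: 1 + x + y, choice=max)

# Canonical and wobble base pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fold(seq, algebra):
    """Evaluate the grammar S -> eps | . S | ( S ) S on seq under an algebra."""

    @lru_cache(maxsize=None)
    def s(i, j):
        # Value of nonterminal S on the half-open interval seq[i:j].
        if i == j:
            return algebra.choice([algebra.nil()])      # S -> eps
        cands = [algebra.unpaired(s(i + 1, j))]         # S -> . S
        for k in range(i + 1, j):                       # S -> ( S ) S
            if (seq[i], seq[k]) in PAIRS:               # i pairs with k
                cands.append(algebra.pair(s(i + 1, k), s(k + 1, j)))
        return algebra.choice(cands)

    return s(0, len(seq))


if __name__ == "__main__":
    print(fold("GCAU", COUNT))      # number of admissible structures: 5
    print(fold("GCAU", MAX_PAIRS))  # maximum number of base pairs: 2
```

Swapping `COUNT` for `MAX_PAIRS` changes what is computed but not which structures are considered; changing the grammar rules inside `fold` would change the search space while leaving both algebras usable as they are.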

This work was funded by the German DFG Collaborative Research Centre AquaDiva (CRC 1076 AquaDiva), the German state of Thuringia via the Thüringer Aufbaubank (2021 FGI 0009), the Carl-Zeiss-Stiftung within the program Scientific Breakthroughs in Artificial Intelligence (project “Interactive Inference”), and the German Federal Ministry of Education and Research (BMBF 031L0164C “RNAProNet”).

Notes

  1. Since the energy model used here is slightly simplified in its evaluation of certain loop terms compared to the full model implemented in ViennaRNA, we occasionally predict structures that are closer to the structure given in the STRAND database, and thus accuracy may also be (slightly) better than that of the ViennaRNA predictions.

Corresponding author

Correspondence to Peter F. Stadler.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Eggers, D., Höner zu Siederdissen, C., Stadler, P.F. (2022). Accuracy of RNA Structure Prediction Depends on the Pseudoknot Grammar. In: Scherer, N.M., de Melo-Minardi, R.C. (eds) Advances in Bioinformatics and Computational Biology. BSB 2022. Lecture Notes in Computer Science, vol 13523. Springer, Cham. https://doi.org/10.1007/978-3-031-21175-1_3

  • DOI: https://doi.org/10.1007/978-3-031-21175-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21174-4

  • Online ISBN: 978-3-031-21175-1

  • eBook Packages: Computer Science, Computer Science (R0)
