Abstract
Most models for pseudoknotted RNA structures can be described by multiple context-free grammars (MCFGs) and are thus amenable to dynamic programming algorithms. They differ strongly in their definition of admissible structures and hence in the search space over which structures are optimized. The accuracy of structure prediction can be expected to depend on the choice of the MCFG: models that are too inclusive are likely to over-predict pseudoknots, while restrictive models exclude more complex pseudoknotted structures by definition. A systematic analysis of the impact of the grammar, however, is difficult since available implementations use incomparable energy parameters. We show here that Algebraic Dynamic Programming over MCFGs naturally disentangles the energy model (specified by the evaluation algebra) from the search space (defined by an MCFG). Preliminary computational experiments indicate that the choice of grammar has an important impact even for short RNA sequences.
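The separation of search space and scoring described above can be illustrated with a toy sketch. It is not the authors' implementation and it is not a dynamic programming algorithm: it enumerates all structures of a very short sequence by brute force, filters them with a predicate that stands in for the grammar, and scores them with a function that stands in for the evaluation algebra. All names (`matchings`, `nested_only`, `anything`, `pair_count`, `weighted`, `optimize`) and the pair weights are assumptions made for illustration only; no realistic energy model or minimum loop length is used.

```python
from itertools import combinations

# Canonical and wobble base pairs (toy pairing rule).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def matchings(seq):
    """Yield every set of base pairs in which each position pairs at most once."""
    n = len(seq)

    def rec(i, taken):
        if i == n:
            yield frozenset()
            return
        # Position i is unpaired, or is the right end of an earlier pair.
        yield from rec(i + 1, taken)
        if i in taken:
            return
        # Position i opens a pair with some later, still free position j.
        for j in range(i + 1, n):
            if j not in taken and (seq[i], seq[j]) in PAIRS:
                for rest in rec(i + 1, taken | {j}):
                    yield rest | {(i, j)}

    return rec(0, frozenset())


def crossing(p, q):
    """True if the two base pairs cross, i.e. i < k < j < l."""
    (i, j), (k, l) = sorted((p, q))
    return i < k < j < l


# "Grammars": predicates that define which structures are admissible.
def nested_only(structure):
    """Pseudoknot-free search space: no two pairs may cross."""
    return not any(crossing(p, q) for p, q in combinations(structure, 2))


def anything(structure):
    """Fully inclusive search space: arbitrary crossings are allowed."""
    return True


# "Algebras": scoring functions, independent of the search space.
def pair_count(seq, structure):
    return len(structure)


def weighted(seq, structure):
    # Hypothetical weights, not thermodynamic parameters.
    w = {"GC": 3, "CG": 3, "AU": 2, "UA": 2, "GU": 1, "UG": 1}
    return sum(w[seq[i] + seq[j]] for i, j in structure)


def optimize(seq, admissible, score):
    """Best score over all admissible structures (exhaustive, tiny inputs only)."""
    return max(score(seq, s) for s in matchings(seq) if admissible(s))


if __name__ == "__main__":
    for grammar in (nested_only, anything):
        for algebra in (pair_count, weighted):
            print(grammar.__name__, algebra.__name__, optimize("ACUG", grammar, algebra))
```

Because `optimize` takes the admissibility predicate and the scoring function as independent arguments, the same scoring can be compared across search spaces, which is the kind of comparison the abstract argues for. In an actual ADP setting the enumeration would be replaced by dynamic programming over the grammar.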
This work was funded by the German DFG Collaborative Research Centre AquaDiva (CRC 1076 AquaDiva), the German state of Thuringia via the Thüringer Aufbaubank (2021 FGI 0009), the Carl-Zeiss-Stiftung within the program Scientific Breakthroughs in Artificial Intelligence (project “Interactive Inference”), and the German Federal Ministry of Education and Research (BMBF 031L0164C “RNAProNet”).
Notes
1. Since the energy model used here is slightly simplified in its evaluation of certain loop terms compared to the full model implemented in ViennaRNA, we occasionally predict structures that are closer to the structures in the STRAND database; accuracy may thus also be (slightly) better than that of the ViennaRNA predictions.
References
Akutsu, T.: Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Discr. Appl. Math. 104, 45–62 (2000). https://doi.org/10.1016/S0166-218X(00)00186-4
Andronescu, M., Bereg, V., Hoos, H.H., Condon, A.: RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinf. 9, 340 (2008). https://doi.org/10.1186/1471-2105-9-340
Brierley, I., Pennell, S., Gilbert, R.J.: Viral RNA pseudoknots: versatile motifs in gene expression and replication. Nat. Rev. Microbiol. 5, 598–610 (2007). https://doi.org/10.1038/nrmicro1704
Condon, A., Davy, B., Rastegari, B., Zhao, S., Tarrant, F.: Classifying RNA pseudoknotted structures. Theor. Comp. Sci. 320, 35–50 (2004). https://doi.org/10.1016/j.tcs.2004.03.042
Dirks, R.M., Pierce, N.A.: A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. 24, 1664–1677 (2003). https://doi.org/10.1002/jcc.10296
Giegerich, R., Meyer, C.: Algebraic dynamic programming. In: Kirchner, H., Ringeissen, C. (eds.) Algebraic Methodology And Software Technology (AMAST 2002), vol. 2422, pp. 243–257. Springer, Berlin (2002). https://doi.org/10.5555/646061.676145
Giegerich, R., Meyer, C., Steffen, P.: A discipline of dynamic programming over sequence data. Sci. Comput. Prog. 51, 215–263 (2004). https://doi.org/10.1016/j.scico.2003.12.005
Giegerich, R., Touzet, H.: Modeling dynamic programming problems over sequences and trees with inverse coupled rewrite systems. Algorithms 7, 62–144 (2014). https://doi.org/10.3390/a7010062
Lorenz, R., et al.: ViennaRNA package 2.0. Alg. Mol. Biol. 6, 26 (2011). https://doi.org/10.1186/1748-7188-6-26
Lyngsø, R.B., Pedersen, C.N.: RNA pseudoknot prediction in energy-based models. J. Comp. Biol. 7, 409–427 (2000). https://doi.org/10.1089/106652700750050862
Lyngsø, R.B., Pedersen, C.N.: Pseudoknots in RNA secondary structures. In: Shamir, R., Miyano, S., Sorin, I. (eds.) RECOMB 2000: Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, pp. 201–209. ACM, New York (2000). https://doi.org/10.1145/332306.332551
Nebel, M.E., Weinberg, F.: Algebraic and combinatorial properties of common RNA pseudoknot classes with applications. J. Comp. Biol. 19, 1134–1150 (2012). https://doi.org/10.1089/cmb.2011.0094
Ponty, Y., Saule, C.: A combinatorial framework for designing (pseudoknotted) RNA algorithms. In: Przytycka, T.M., Sagot, M.-F. (eds.) WABI 2011. LNCS, vol. 6833, pp. 250–269. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23038-7_22
Reeder, J., Giegerich, R.: Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics. BMC Bioinf. 5, 104 (2004). https://doi.org/10.1186/1471-2105-5-104
Reidys, C.M., Huang, F.W.D., Andersen, J.E., Penner, R.C., Stadler, P.F., Nebel, M.E.: Topology and prediction of RNA pseudoknots. Bioinformatics 27, 1076–1085 (2011). https://doi.org/10.1093/bioinformatics/btr090. Addendum in: Bioinformatics 28, 300 (2012)
Riechert, M., Höner zu Siederdissen, C., Stadler, P.F.: Algebraic dynamic programming for multiple context-free grammars. Theor. Comp. Sci. 639, 91–109 (2016). https://doi.org/10.1016/j.tcs.2016.05.032
Rivas, E., Eddy, S.R.: A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol. 285, 2053–2068 (1999). https://doi.org/10.1006/jmbi.1998.2436
Rivas, E., Lang, R., Eddy, S.R.: A range of complex probabilistic models for RNA secondary structure prediction that include the nearest neighbor model and more. RNA 18, 193–212 (2012). https://doi.org/10.1261/rna.030049.111
Seki, H., Matsumura, T., Fujii, M., Kasami, T.: On multiple context free grammars. Theor. Comp. Sci. 88, 191–229 (1991). https://doi.org/10.1016/0304-3975(91)90374-B
Sheikh, S., Backofen, R., Ponty, Y.: Impact of the energy model on the complexity of RNA folding with pseudoknots. In: Kärkkäinen, J., Stoye, J. (eds.) CPM 2012. LNCS, vol. 7354, pp. 321–333. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31265-6_26
Höner zu Siederdissen, C.: Sneaking around concatMap: efficient combinators for dynamic programming. In: Proceedings of the 17th ACM SIGPLAN International Conference on Functional Programming, ICFP 2012, pp. 215–226. ACM, New York (2012). https://doi.org/10.1145/2364527.2364559
Höner zu Siederdissen, C., Hofacker, I.L., Stadler, P.F.: Product grammars for alignment and folding. IEEE/ACM Trans. Comp. Biol. Bioinf. 12, 507–519 (2014). https://doi.org/10.1109/TCBB.2014.2326155
Höner zu Siederdissen, C., Prohaska, S.J., Stadler, P.F.: Algebraic dynamic programming over general data structures. BMC Bioinf. 16, S2 (2015). https://doi.org/10.1186/1471-2105-16-S19-S2
Staple, D.W., Butcher, S.E.: Pseudoknots: RNA structures with diverse functions. PLoS Biol. 3, e213 (2005). https://doi.org/10.1371/journal.pbio.0030213
Steffen, P., Giegerich, R.: Versatile and declarative dynamic programming using pair algebras. BMC Bioinf. 6, 224 (2005). https://doi.org/10.1186/1471-2105-6-224
Taufer, M., et al.: PseudoBase++: an extension of PseudoBase for easy searching, formatting, and visualization of pseudoknots. Nucl. Acids Res. 37, D127–D135 (2009). https://doi.org/10.1093/nar/gkn806
Turner, D.H., Mathews, D.H.: NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucl. Acids Res. 38, D280–D282 (2010). https://doi.org/10.1093/nar/gkp892
Ward, M., Datta, A., Wise, M., Mathews, D.H.: Advanced multi-loop algorithms for RNA secondary structure prediction reveal that the simplest model is best. Nucl. Acids Res. 45, 8541–8550 (2017). https://doi.org/10.1093/nar/gkx512
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Eggers, D., Höner zu Siederdissen, C., Stadler, P.F. (2022). Accuracy of RNA Structure Prediction Depends on the Pseudoknot Grammar. In: Scherer, N.M., de Melo-Minardi, R.C. (eds) Advances in Bioinformatics and Computational Biology. BSB 2022. Lecture Notes in Computer Science, vol 13523. Springer, Cham. https://doi.org/10.1007/978-3-031-21175-1_3
DOI: https://doi.org/10.1007/978-3-031-21175-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21174-4
Online ISBN: 978-3-031-21175-1
eBook Packages: Computer Science, Computer Science (R0)