
A Non-parametric Bayesian Approach for Predicting RNA Secondary Structures

  • Conference paper
Algorithms in Bioinformatics (WABI 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5724)

Abstract

Many functional RNAs form stable secondary structures that are related to their functions, so RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here, non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed in advance. Instead, their distributions are inferred so that they adapt (in the Bayesian sense) to the training sequences provided. Our RNA secondary structure prediction results show that HDP-SCFGs are more accurate than MFE-based models and other generative models.
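To make the idea of an unfixed number of non-terminal symbols more concrete, the following is a minimal illustrative sketch of the stick-breaking construction commonly used to represent Dirichlet process weights. It is not the authors' implementation and does not build an SCFG. The function names, the truncation level, the concentration values, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking weights with concentration alpha.

    Hypothetical helper: each weight could stand for the prior probability
    of one non-terminal symbol in a grammar with an open-ended symbol set.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        # Break off a Beta(1, alpha) share of what is left.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    # The final component absorbs the leftover mass so the weights sum to 1.
    weights.append(remaining)
    return weights


def effective_symbols(weights, threshold=0.01):
    """Count components whose weight exceeds an (assumed) threshold."""
    return sum(1 for w in weights if w > threshold)


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.5, 2.0, 8.0):
        w = stick_breaking_weights(alpha, truncation=50, rng=rng)
        print(f"alpha={alpha}: {effective_symbols(w)} symbols above threshold")
```

In this sketch, the truncation level only bounds the computation. How many symbols carry non-negligible weight is governed by the draws themselves, and in a full model it would be shaped by the training sequences through posterior inference rather than set by hand.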

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sato, K., Hamada, M., Mituyama, T., Asai, K., Sakakibara, Y. (2009). A Non-parametric Bayesian Approach for Predicting RNA Secondary Structures. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_24

  • DOI: https://doi.org/10.1007/978-3-642-04241-6_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04240-9

  • Online ISBN: 978-3-642-04241-6

  • eBook Packages: Computer Science, Computer Science (R0)
