Non-canonical Imperfect Base Pair Predictor: The RNA 3D Structure Modeling Process Improvement

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9043)

Abstract

RNAs form a large group of macromolecules involved in many essential cellular processes. They can adopt complex secondary and three-dimensional structures, and their biological functions depend strongly on those structures. High-quality RNA structure determination is therefore key to understanding RNA functions and roles in molecular pathways. However, in many cases the structure cannot be solved experimentally, or the process is too expensive and laborious. This problem can be avoided by using bioinformatics methods for computational RNA structure prediction. Such applications have been developed; however, the quality of their predictions, especially for large RNA structures, remains too low.

One of the most important aspects of RNA 3D model building is the identification and validation of intramolecular interactions. In this work I propose a method that can improve this stage of model building and should result in better final three-dimensional RNA models.

I constructed a predictor that can identify both canonical and non-canonical base pair interactions within a given structure. Its main advantages are: 1) the ability to work on incomplete input structures, and 2) the ability to correctly predict the base pair type even for imperfect (fuzzy) input atom coordinates.
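The two kinds of degraded input named above can be pictured with a small sketch. It is only an illustration under stated assumptions: the toy residue, its coordinates, the chosen atom subset, and the helper names `coarse_grain` and `fuzz` are all hypothetical, and the abstract does not say how imperfect or incomplete structures were actually generated.

```python
import math
import random

# Hypothetical toy residue: atom name -> (x, y, z) in angstroms.
# The coordinates are made up for illustration, not taken from a real structure.
RESIDUE = {
    "P": (0.0, 0.0, 0.0),
    "C4'": (3.0, 1.0, 0.5),
    "N9": (4.5, 2.5, 1.0),
    "C2": (6.0, 4.0, 1.5),
}


def coarse_grain(residue, keep):
    """Return an incomplete residue holding only the atoms named in keep."""
    return {name: xyz for name, xyz in residue.items() if name in keep}


def fuzz(residue, sigma, rng):
    """Add independent Gaussian noise (std. dev. sigma, angstroms) to each coordinate."""
    return {
        name: tuple(c + rng.gauss(0.0, sigma) for c in xyz)
        for name, xyz in residue.items()
    }


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
reduced = coarse_grain(RESIDUE, keep={"P", "C4'", "N9"})  # assumed atom subset
noisy = fuzz(reduced, sigma=0.5, rng=rng)  # assumed noise level
shift = distance(reduced["N9"], noisy["N9"])  # displacement caused by the noise
```

A predictor that tolerates such input has to assign the same pair type to `reduced` and `noisy` as it would to the full, unperturbed residue.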

The predictor is based on a set of multi-class SVM classifiers. For each input base pair, the classifier chooses one of 18 recognized pair types. The predictor was trained on high-quality experimental data and tested on different, imperfect and incomplete (coarse-grained) structures. For the tested fuzzy nucleotide pairs, the predictor correctly recognizes about 96% of pairs on average.
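The abstract does not state how a single pair type is selected from the 18 candidates. One common way to build a multi-class decision out of binary classifiers is one-vs-one majority voting, sketched below. The integer class labels, the one-dimensional feature, and the distance-based stand-in deciders are assumptions made for illustration; they are not SVMs and not the author's features.

```python
from collections import Counter
from itertools import combinations

N_CLASSES = 18  # number of pair types named in the abstract
CLASSES = list(range(N_CLASSES))  # placeholder labels; real type names are not given here


def one_vs_one_predict(x, binary_deciders):
    """Combine pairwise decisions by majority vote.

    binary_deciders maps each class pair (a, b), a < b, to a function
    that returns either a or b for the feature vector x.
    """
    votes = Counter(decide(x) for decide in binary_deciders.values())
    best = max(votes.values())
    # Break ties in favour of the smallest label so the result is deterministic.
    return min(label for label, count in votes.items() if count == best)


def make_decider(a, b):
    """Stand-in binary decider (NOT an SVM): pick the label closer to x[0].

    A trained binary SVM for the class pair (a, b) would take this place.
    """
    return lambda x: a if abs(x[0] - a) <= abs(x[0] - b) else b


# One decider per unordered class pair: 18 * 17 / 2 = 153 in total.
deciders = {(a, b): make_decider(a, b) for a, b in combinations(CLASSES, 2)}

predicted = one_vs_one_predict([7.2], deciders)
```

With 18 classes this scheme needs 153 pairwise deciders; a one-vs-rest scheme would need only 18, at the cost of less balanced training sets.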

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Śmietański, J. (2015). Non-canonical Imperfect Base Pair Predictor: The RNA 3D Structure Modeling Process Improvement. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_64

  • DOI: https://doi.org/10.1007/978-3-319-16483-0_64

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16482-3

  • Online ISBN: 978-3-319-16483-0

  • eBook Packages: Computer Science (R0)