  • Brief Communication

Deep generative design of RNA aptamers using structural predictions

Abstract

RNAs represent a class of programmable biomolecules capable of performing diverse biological functions. Recent studies have developed accurate RNA three-dimensional structure prediction methods, which may enable new RNAs to be designed in a structure-guided manner. Here, we develop a structure-to-sequence deep learning platform for the de novo generative design of RNA aptamers. We show that our approach can design RNA aptamers that are predicted to be structurally similar, yet sequence dissimilar, to known light-up aptamers that fluoresce in the presence of small molecules. We experimentally validate several generated RNA aptamers to have fluorescent activity, show that these aptamers can be optimized for activity in silico, and find that they exhibit a mechanism of fluorescence similar to that of known light-up aptamers. Our results demonstrate how structural predictions can guide the targeted and resource-efficient design of new RNA sequences.
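
As a rough illustration of what "sequence dissimilar" could mean in practice, the following Python sketch scores a candidate sequence against a reference using a normalized edit distance. It is not RhoDesign's evaluation code: the function names, the choice of edit distance as the similarity measure and the 0.6 identity cutoff are all assumptions made for illustration, and the sequences in the usage note are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def sequence_identity(a: str, b: str) -> float:
    """Hypothetical identity score: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def is_sequence_dissimilar(candidate: str, reference: str,
                           max_identity: float = 0.6) -> bool:
    """True if the candidate falls at or below the (assumed) identity cutoff."""
    return sequence_identity(candidate, reference) <= max_identity
```

For example, `sequence_identity("ACGU", "AGGU")` gives 0.75 under this definition, so that pair would not count as dissimilar at the assumed cutoff. A structural filter (for instance, a TM-score threshold on predicted structures) would be applied separately and is not shown here.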

Fig. 1: A deep learning approach for the 3D-structure-based generative design of RNAs.
Fig. 2: Experimental validation, optimization and mechanism of fluorescence of generated light-up RNA aptamers.

Data availability

The numerical data supporting the findings of this paper are provided in the Source Data, and the sequences can be generated by running RhoDesign. Source Data for Figs. 1 and 2 are available. Sequences generated from RhoDesign and accompanying data are available as Supplementary Data 1. The training dataset and model checkpoints for RhoDesign are available from Zenodo (ref. 60). The PDB structure for Mango-III (A10U), 6UP0, is available from the PDB (ref. 61).

Code availability

RhoDesign is available at https://github.com/ml4bio/RhoDesign and from Zenodo (ref. 60).

References

  1. Cech, T. R., Zaug, A. J. & Grabowski, P. J. In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening sequence. Cell 27, 487–496 (1981).

  2. Guerrier-Takada, C., Gardiner, K., Marsh, T., Pace, N. & Altman, S. The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35, 849–857 (1983).

  3. Statello, L., Guo, C.-J., Chen, L.-L. & Huarte, M. Gene regulation by long non-coding RNAs and its biological functions. Nat. Rev. Mol. Cell Biol. 22, 96–118 (2021).

  4. Dinger, M. E., Mercer, T. R. & Mattick, J. S. RNAs as extracellular signaling molecules. J. Mol. Endocrinol. 40, 151–159 (2008).

  5. Keefe, A. D., Pai, S. & Ellington, A. Aptamers as therapeutics. Nat. Rev. Drug Discov. 9, 537–550 (2010).

  6. Tuerk, C., MacDougal, S. & Gold, L. RNA pseudoknots that inhibit human immunodeficiency virus type 1 reverse transcriptase. Proc. Natl. Acad. Sci. USA 89, 6988–6992 (1992).

  7. Pardee, K. et al. Rapid, low-cost detection of Zika virus using programmable biomolecular components. Cell 165, 1255–1266 (2016).

  8. Angenent-Mari, N. M., Garruss, A. S., Soenksen, L. R., Church, G. & Collins, J. J. A deep learning approach to programmable RNA switches. Nat. Commun. 11, 5057 (2020).

  9. Valeri, J. A. et al. Sequence-to-function deep learning frameworks for engineered riboregulators. Nat. Commun. 11, 5058 (2020).

  10. Takahashi, M. K. et al. A low-cost paper-based synthetic biology platform for analyzing gut microbiota and host biomarkers. Nat. Commun. 9, 3347 (2018).

  11. Green, A. A., Silver, P. A., Collins, J. J. & Yin, P. Toehold switches: de-novo-designed regulators of gene expression. Cell 159, 925–939 (2014).

  12. Paige, J. S., Wu, K. Y. & Jaffrey, S. R. RNA mimics of green fluorescent protein. Science 333, 642–646 (2011).

  13. Miao, Z. & Westhof, E. RNA structure: advances and assessment of 3D structure prediction. Annu. Rev. Biophys. 46, 483–503 (2017).

  14. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  15. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

  16. Shen, T. et al. E2Efold-3D: end-to-end deep learning method for accurate de novo RNA 3D structure prediction. Preprint at https://arxiv.org/abs/2207.01586 (2022).

  17. Wang, W. et al. trRosettaRNA: automated prediction of RNA 3D structure with transformer network. Nat. Commun. 14, 7266 (2023).

  18. Pearce, R., Li, Y., Omenn, G. S. & Zhang, Y. Fast and accurate ab initio protein structure prediction using deep learning potentials. PLoS Comput. Biol. 18, e1010539 (2022).

  19. Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).

  20. Das, R. et al. Assessment of three-dimensional RNA structure prediction in CASP15. Proteins 91, 1747–1770 (2023).

  21. Runge, F., Stoll, D., Falkner, S. & Hutter, F. Learning to design RNA. In International Conference on Learning Representations 2019 https://openreview.net/pdf?id=ByfyHh05tQ (ICLR, 2019).

  22. Wu, M. J., Andreasson, J. O. L., Kladwang, W., Greenleaf, W. & Das, R. Automated design of diverse stand-alone riboswitches. ACS Synth. Biol. 8, 1838–1846 (2019).

  23. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  24. Jing, B. et al. Learning from protein structure with geometric vector perceptrons. In International Conference on Learning Representations https://openreview.net/pdf?id=1YLJDvSx6J4 (ICLR, 2021).

  25. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (NIPS, 2017).

  26. Hsu, C. et al. Learning inverse folding from millions of predicted structures. Proc. Mach. Learn. Res. 162, 8946–8970 (2022).

  27. Yang, X., Yoshizoe, K., Taneda, A. & Tsuda, K. RNA inverse folding using Monte Carlo tree search. BMC Bioinform. 18, 468 (2017).

  28. Joshi, C. K. & Liò, P. gRNAde: a geometric deep learning pipeline for 3D RNA inverse design. Methods Mol. Biol. 2847, 121–135 (2025).

  29. Tan, C. et al. RDesign: hierarchical data-efficient representation learning for tertiary structure-based RNA design. In The Twelfth International Conference on Learning Representations (ICLR, 2024).

  30. Rubio-Largo, Á., Lozano-García, N., Granado-Criado, J. & Vega-Rodríguez, M. A. Solving the RNA inverse folding problem through target structure decomposition and multiobjective evolutionary computation. Appl. Soft Comput. 147, 110779 (2023).

  31. Autour, A. et al. Fluorogenic RNA Mango aptamers for imaging small non-coding RNAs in mammalian cells. Nat. Commun. 9, 656 (2018).

  32. Jeng, S. C. Y. et al. Fluorogenic aptamers resolve the flexibility of RNA junctions using orientation-dependent FRET. RNA 27, 433–444 (2021).

  33. Iwano, N. et al. Generative aptamer discovery using RaptGen. Nat. Comput. Sci. 2, 378–386 (2022).

  34. Jiang, P. et al. MPBind: a meta-motif-based statistical framework and pipeline to predict binding potential of SELEX-derived aptamers. Bioinformatics 30, 2665–2667 (2014).

  35. Jeng, S. C., Chan, H. H., Booy, E. P., McKenna, S. A. & Unrau, P. J. Fluorophore ligand binding and complex stabilization of the RNA Mango and RNA Spinach aptamers. RNA 22, 1884–1892 (2016).

  36. Trachman, R. J. III et al. Structural basis for high-affinity fluorophore binding and activation by RNA Mango. Nat. Chem. Biol. 13, 807–813 (2017).

  37. Liu, L. Y., Ma, T. Z., Zeng, Y. L., Liu, W. & Mao, Z. W. Structural basis of pyridostatin and its derivatives specifically binding to G-quadruplexes. J. Am. Chem. Soc. 144, 11878–11887 (2022).

  38. Han, F. X., Wheelhouse, R. T. & Hurley, L. H. Interactions of TMPyP4 and TMPyP2 with quadruplex DNA. Structural basis for the differential effects on telomerase inhibition. J. Am. Chem. Soc. 121, 3561–3570 (1999).

  39. Rocca, R. et al. Molecular recognition of a carboxy pyridostatin toward G-quadruplex structures: why does it prefer RNA? Chem. Biol. Drug Des. 90, 919–925 (2017).

  40. Chen, X. C. et al. Tracking the dynamic folding and unfolding of RNA G-quadruplexes in live cells. Angew. Chem. Int. Ed. Engl. 57, 4702–4706 (2018).

  41. Ellington, A. D. & Szostak, J. W. In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818–822 (1990).

  42. Lu, X. J., Bussemaker, H. J. & Olson, W. K. DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res. 43, e142 (2015).

  43. The RNAcentral Consortium. RNAcentral: a hub of information for non-coding RNA sequences. Nucleic Acids Res. 47, D221–D229 (2019).

  44. Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).

  45. Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682 (2010).

  46. Zhang, C., Shine, M., Pyle, A. M. & Zhang, Y. US-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes. Nat. Methods 19, 1109–1115 (2022).

  47. Huang, P.-S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).

  48. Boniecki, M. J. et al. SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res. 44, e63 (2016).

  49. Li, Y. et al. Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction. Nat. Commun. 14, 5745 (2023).

  50. Biesiada, M. et al. Automated RNA 3D structure prediction with RNAComposer. Methods Mol. Biol. 1490, 199–215 (2016).

  51. Baek, M. et al. Accurate prediction of protein–nucleic acid complexes using RoseTTAFoldNA. Nat. Methods 21, 117–121 (2024).

  52. Case, D. A. et al. AmberTools. J. Chem. Inf. Model. 63, 6183–6191 (2023).

  53. Zok, T. et al. RNApdbee 2.0: multifunctional tool for RNA structure annotation. Nucleic Acids Res. 46, W30–W35 (2018).

  54. Fu, L. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res. 50, e14 (2022).

  55. Lorenz, R. et al. ViennaRNA package 2.0. Algorithms Mol. Biol. 6, 26 (2011).

  56. Sato, K., Akiyama, M. & Sakakibara, Y. RNA secondary structure prediction using deep learning with thermodynamic integration. Nat. Commun. 12, 941 (2021).

  57. Chen, J. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. Preprint at https://arxiv.org/abs/2204.00300 (2022).

  58. Wong, F. et al. Benchmarking AlphaFold-enabled molecular docking predictions for antibiotic discovery. Mol. Syst. Biol. 18, e11081 (2022).

  59. Trachman, R. J. III et al. Structure and functional reselection of the Mango-III fluorogenic RNA aptamer. Nat. Chem. Biol. 15, 472–479 (2019).

  60. Wong, F. et al. Supporting code for: Deep generative design of RNA aptamers using structural predictions. Zenodo https://doi.org/10.5281/zenodo.13892413 (2024).

  61. Trachman, R. J. & Ferre-D'Amare, A. R. Structure of the Mango-III fluorescent aptamer bound to YO3-biotin. Protein Data Bank https://doi.org/10.2210/pdb6UP0/pdb (2019).

Acknowledgements

This work was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award K25AI168451 (to F.W.), the Swiss National Science Foundation under grant number SNSF_203071 (to A.K.), the National Science Foundation Graduate Research Fellowship (to A.Z.W.), the Research Grants Council of the Hong Kong Special Administrative Region, China (projects CUHK 14222922 and RGC GRF 2151185 to I.K. and project CUHK 24204023 to Y.L.), a grant from the Innovation and Technology Commission of the Hong Kong Special Administrative Region, China (projects GHP/065/21SZ, IDBF24ENG06 and ITS/247/23FP to Y.L.), the National Key R&D Program of China (project 2022ZD0160101 to Y.L.) and the Broad Institute of MIT and Harvard (to J.J.C.). This work is part of the Antibiotics-AI Project, which is directed by J.J.C. and supported by the Audacious Project, Flu Lab, LLC, the Sea Grape Foundation, R. Zander and H. Wyss for the Wyss Foundation, and an anonymous donor. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

F.W. conceived research, performed or directed all experiments, wrote the paper and supervised research. D.H. and L.H. developed RhoDesign and performed computational analyses, with contributions from J.W., Z.H., Q.Y. and I.K. A.K. and A.Z.W. conceived research and performed experiments and analyses. S.O. and A.L. performed experiments. J.R., W.J., T.Z., K.I. and J.X.C. performed analyses. S.Z. conceived research and performed analyses. Y.L. conceived research, performed or directed all analyses and supervised research. J.J.C. conceived and supervised research. All authors assisted with manuscript editing.

Corresponding authors

Correspondence to Yu Li or James J. Collins.

Ethics declarations

Competing interests

J.J.C. is the founding scientific advisory board chair of Integrated Biosciences. F.W. is a co-founder of Integrated Biosciences. S.O. has an equity interest in Integrated Biosciences. The other authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Jianyi Yang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Jie Pan, in collaboration with the Nature Computational Science team. Peer reviewer reports are available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Structural fidelity of RhoFold’s Mango-III prediction, conserved sequence motifs in Mango aptamers, and comparison to aptamers 1–4.

a, (Left) RhoFold-predicted 3D structure of Mango-III, aligned to the ground truth structure for Mango-III in PDB 6UP0. (Right) AlphaFold 3-predicted 3D structure of Mango-III, aligned to the same ground truth structure. b, Comparison of the sequences of aptamers 1–4 against Mango sequences. Conserved sequence motifs in Mango aptamers are indicated in red.

Extended Data Fig. 2 AlphaFold 3-predicted 3D and RhoFold-predicted secondary structures.

a, Predicted 3D structures for aptamers 1–4 generated using AlphaFold 3, as detailed in the Methods section "RNA 3D structure prediction". RMSD, TM-score and pLDDT values for each structure, compared with the ground truth structure for Mango-III in PDB 6UP0, are shown. b, Secondary structures for Mango-III and aptamers 1–4, generated from the corresponding PDB structure (6UP0; Mango-III) or from RhoFold predictions (aptamers 1–4), as detailed in the Methods section "RNA secondary structure prediction".

Supplementary information

Supplementary Information

Supplementary Figs. 1 and 2 and Tables 1–3.

Reporting Summary

Peer Review File

Supplementary Data 1

Generated and tested RNA sequences, in addition to model predictions of fluorescence activity.

Source data

Source Data Figs. 1 and 2

Statistical source data for Figs. 1 and 2.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wong, F., He, D., Krishnan, A. et al. Deep generative design of RNA aptamers using structural predictions. Nat Comput Sci 4, 829–839 (2024). https://doi.org/10.1038/s43588-024-00720-6

  • DOI: https://doi.org/10.1038/s43588-024-00720-6
