DOI:10.1145/3647817.3647818
research-article

Transformers-RNP: Predicting the mutation effect on the stability of Protein-RNA complex with deep learning-based model

Published: 11 April 2024

Abstract

Protein-nucleic acid interactions play a pivotal role in the pathogenesis of certain diseases and have been studied intensively for years. Protein missense mutations can alter protein-nucleic acid binding affinity. Existing machine learning tools for predicting these effects rely heavily on protein structural information, which limits their applicability when only protein sequences are available. In this study, we designed a deep learning-based model, Transformers-RNP, that works solely on sequence information and mutation data. To enhance performance, we integrated HHblits-predicted evolutionarily conserved properties into the model. We trained Transformers-RNP on ProNAB, a recently published database of protein-nucleic acid binding affinities. Our model obtained a root mean square error (RMSE) of 0.809 kcal/mol and a Pearson correlation coefficient of 0.728 on the test dataset. Furthermore, we performed a case study on a protein-nucleic acid complex with a single-point mutation. Notably, the model's weights were focused on the interaction region. The results show that Transformers-RNP serves as a robust tool for discerning mutation effects on protein-RNA interaction sites.
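The two evaluation metrics reported in the abstract, RMSE and the Pearson correlation coefficient, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `rmse` and `pearson_r` are hypothetical, and the measured and predicted values are made-up toy numbers in kcal/mol, not data from ProNAB.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances, left unnormalised because the 1/n factors cancel.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Toy values in kcal/mol (illustrative assumptions, not from the paper).
measured = [0.5, -1.2, 2.0, 0.0]
predicted = [0.7, -0.8, 1.5, 0.3]

print(f"RMSE: {rmse(measured, predicted):.3f} kcal/mol")
print(f"Pearson r: {pearson_r(measured, predicted):.3f}")
```

RMSE is expressed in the same unit as the target (here kcal/mol) and penalises large individual errors, while Pearson's r is unitless and measures only how well predictions track the measured values linearly, so the two are usually reported together.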


    Published In

    ICBBS '23: Proceedings of the 2023 12th International Conference on Bioinformatics and Biomedical Science
    October 2023
    76 pages
    ISBN:9798400716140
    DOI:10.1145/3647817
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Computational Biology
    2. Deep Learning
    3. Protein-RNA Interaction
    4. Transformer

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Shanghai Jiao Tong University

    Conference

    ICBBS 2023
