Backbone dihedral angles prediction servers for protein early-stage structure prediction

Tomasz Smolarczyk; Katarzyna Stapor; Irena Roterman-Konieczna

doi:10.1515/bams-2019-0034

Published by De Gruyter October 16, 2019

From the journal Bio-Algorithms and Med-Systems

Abstract

Three-dimensional protein structure prediction is an important task at the intersection of biology, chemistry, and informatics, and it is crucial for determining protein function. Within the two-stage protein folding model, which is based on early- and late-stage intermediates, we propose using state-of-the-art secondary structure prediction servers to predict backbone dihedral angles and derive an early-stage structure. Early-stage structures serve as the starting point for protein folding simulations, so any errors at this stage affect the final predictions. We show that modern secondary structure prediction servers can increase the accuracy of early-stage predictions compared with previously reported models.

Keywords: early stage; MUFOLD-SS; protein folding; secondary structure; SPIDER3

Ethical Approval: The conducted research is not related to either human or animal use.
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Employment or leadership: None declared.
Honorarium: None declared.
Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.
Conflict of interests: The authors declare no conflict of interest.

Received: 2019-07-24

Accepted: 2019-09-16

Published Online: 2019-10-16
