Abstract
A variety of methods have been proposed for calculating structure similarity, a task known as structure alignment or superposition. One major shortcoming of current structure alignment algorithms is inherent in their design, which is based on local structure similarity. In this work, we propose a method that incorporates global information when searching for optimal alignments and superpositions. When applied to optimizing the TM-score and the GDT score, our method produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, for protein pairs where the highest TM-score found by TMalign is lower than 0.6 and the highest TM-score found by at least one of the tested methods is higher than 0.5, TMalign fails to find a TM-score higher than 0.5 with a probability of 42%, whereas this probability drops to 2% when our method is used. This could significantly improve the accuracy of fold detection when a TM-score cutoff of 0.5 is used.
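For readers unfamiliar with the quantity being optimized, the sketch below evaluates the TM-score for one fixed alignment and one fixed superposition, using the standard definition TM = (1/L) Σ 1/(1 + (d_i/d0)^2) with d0 = 1.24 (L − 15)^(1/3) − 1.8. It is not the search method proposed in this paper: finding the alignment and superposition that maximize this value is the hard part, and that is what the paper addresses. The function names, the clamp of d0 to 0.5 for short chains, and the `likely_same_fold` helper are assumptions made for illustration.

```python
from typing import Sequence


def tm_d0(l_target: int) -> float:
    """Distance scale d0 (in angstroms) for a target chain of l_target residues."""
    if l_target <= 21:
        # Assumption: clamp to 0.5 for very short chains, where the formula
        # would fall below 0.5 or be undefined.
        return 0.5
    return 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances: Sequence[float], l_target: int) -> float:
    """TM-score of a fixed alignment under a fixed superposition.

    distances: distance (angstroms) between each aligned residue pair
               after superposition; unaligned residues contribute nothing.
    l_target:  length of the chain used for normalization.
    """
    if l_target <= 0:
        raise ValueError("l_target must be positive")
    d0 = tm_d0(l_target)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_target


def likely_same_fold(score: float, cutoff: float = 0.5) -> bool:
    """Hypothetical helper applying the 0.5 cutoff mentioned in the abstract."""
    return score > cutoff
```

Because each aligned pair contributes at most 1, a perfect superposition of two chains of equal length scores exactly 1, and a pair separated by exactly d0 contributes 0.5.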
In addition, existing structure alignment algorithms focus on structure similarity alone and ignore other important similarities, such as sequence similarity. Our approach can incorporate multiple similarities into the scoring function. Results show that sequence similarity helps to find high-quality protein structure alignments that are more consistent with the eye-examined alignments in HOMSTRAD. Even when structure similarity alone fails to find alignments that are at all consistent with the eye-examined alignments, our method remains capable of finding alignments that are highly similar, or even identical, to them.
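One simple way to picture a scoring function that combines several similarities is a weighted sum of a structural term and a sequence term for each aligned residue pair. The sketch below is only an illustration of that idea under stated assumptions: the abstract does not give the paper's actual scoring function, and the additive form, the `weight` parameter, and the toy substitution table are all hypothetical.

```python
# Toy substitution scores for a handful of residue pairs (illustrative only;
# a real application would load a full matrix such as BLOSUM62).
TOY_SUBSTITUTION = {
    ("A", "A"): 4,
    ("L", "L"): 4,
    ("I", "L"): 2,
    ("A", "W"): -3,
    ("W", "W"): 11,
}


def structure_term(distance: float, d0: float) -> float:
    """TM-score-style contribution of one aligned pair, in (0, 1]."""
    return 1.0 / (1.0 + (distance / d0) ** 2)


def sequence_term(res_a: str, res_b: str) -> float:
    """Symmetric lookup in the toy table; unknown pairs score 0."""
    key = (res_a, res_b) if (res_a, res_b) in TOY_SUBSTITUTION else (res_b, res_a)
    return float(TOY_SUBSTITUTION.get(key, 0))


def combined_pair_score(distance: float, res_a: str, res_b: str,
                        d0: float, weight: float = 0.1) -> float:
    """Hypothetical combined score: structural term plus weighted sequence term."""
    return structure_term(distance, d0) + weight * sequence_term(res_a, res_b)
```

With `weight=0.0` the score reduces to the purely structural term, so the sequence contribution can be switched on gradually when comparing alignments against a reference set such as HOMSTRAD.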
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cui, X., Li, S.C., Bu, D., Li, M. (2013). Towards Reliable Automatic Protein Structure Alignment. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science(), vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_3
DOI: https://doi.org/10.1007/978-3-642-40453-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40452-8
Online ISBN: 978-3-642-40453-5
eBook Packages: Computer Science, Computer Science (R0)