ABSTRACT
Protein structure alignment is often used to infer a sequence alignment. The inverse problem of using alignments of protein sequences to derive structural alignments has been extensively studied, but applications remain scarce. Here we present pMatch, a novel algorithm that takes advantage of sequence profiles to compute highly accurate protein structure alignments. Contrary to some claims that sequence information is useful only when aligning structures of proteins that share a high degree of sequence identity, we demonstrate that incorporating sequence information can increase the accuracy of alignments even between proteins that share only about 10% identical residues. We use two widely established benchmarks to show that our method is capable of accurately aligning protein structures that exhibit substantial conformational changes, large residue insertions/deletions, and circular permutations. A preliminary version of the pMatch Web server is available at http://bioinformatics.cs.uni.edu.
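To make the "about 10% identical residues" figure concrete, the sketch below shows one common way to compute percent sequence identity from a pairwise alignment. It is an illustration only and not part of pMatch: the function name `percent_identity`, the gap character, and the choice of denominator (columns in which both sequences have a residue) are assumptions, and other definitions of sequence identity divide by the length of the shorter sequence or of the whole alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where both sequences are ungapped.

    Hypothetical helper for illustration; the denominator convention is an
    assumption, not the definition used by the paper.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    aligned = 0    # columns with a residue in both sequences
    identical = 0  # columns where those two residues match
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip insertion/deletion columns
        aligned += 1
        if x == y:
            identical += 1

    # An alignment with no residue-residue columns has no defined identity;
    # report 0.0 rather than dividing by zero.
    return 100.0 * identical / aligned if aligned else 0.0


# Toy alignment: 5 residue-residue columns, 4 of them identical.
print(percent_identity("ACD-EF", "ACDGQF"))
```

Under this convention, a pair of proteins in the range the abstract describes would score roughly 10 on real alignments, which is far below the level at which sequence comparison alone is usually considered reliable.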
Index Terms
- Utilizing twilight zone sequence similarities to increase the accuracy of protein 3D structure comparison