DOI: 10.1145/2649387.2649394
short-paper

Utilizing twilight zone sequence similarities to increase the accuracy of protein 3D structure comparison

Published: 20 September 2014

ABSTRACT

Protein structure alignment is often used to infer a sequence alignment. The inverse problem, using alignments of protein sequences to derive structural alignments, has been extensively studied, but applications remain scarce. Here we present pMatch, a novel algorithm that takes advantage of sequence profiles to compute highly accurate protein structure alignments. Contrary to some claims that sequence information is useful only when aligning structures of proteins that share a high degree of sequence identity, we demonstrate that incorporating sequence information can increase the accuracy of alignments even between proteins sharing only about 10% identical residues. We use two widely established benchmarks to show that our method can accurately align protein structures that exhibit substantial conformational changes, large residue insertions/deletions, and circular permutations. A preliminary version of the pMatch Web server is available at http://bioinformatics.cs.uni.edu.
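To make the "about 10% identical residues" figure concrete, the sketch below computes percent sequence identity from a pairwise alignment. It is an illustration only and is not part of pMatch: the function name percent_identity and the toy sequences are hypothetical, and the choice of denominator (columns where both sequences have a residue) is an assumption, since the abstract does not say which identity definition the authors use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned residue pairs that are identical.

    Assumption: identity is measured over columns where neither sequence
    has a gap. Other conventions divide by the shorter sequence length or
    by the full alignment length and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0

    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical sequences): 5 aligned pairs, 4 identical.
print(percent_identity("ACDE-FG", "ACQEWF-"))  # 80.0
```

Under this definition, a pair at about 10% identity would have roughly one matching residue in every ten aligned positions, which is the low-similarity regime the abstract refers to.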

Published in

BCB '14: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
September 2014
851 pages
ISBN: 9781450328944
DOI: 10.1145/2649387
General Chairs: Pierre Baldi, Wei Wang

Copyright © 2014 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 254 of 885 submissions (29%)