Abstract
This article presents in detail our proposed methodology for detecting similarity between or among three-dimensional protein structures. The novelty of our algorithm is that, while computing similarity, it can combine many attributes and satisfy a number of preconditions, which are discussed extensively throughout the paper. Our approach is also supported by an efficient and effective indexing scheme, which yields convincing results compared with other known methods.
Cite this article
Iakovidou, N., Tiakas, E., Tsichlas, K. et al. Going over the three dimensional protein structure similarity problem. Artif Intell Rev 42, 445–459 (2014). https://doi.org/10.1007/s10462-013-9416-9
DOI: https://doi.org/10.1007/s10462-013-9416-9