Abstract
Similarity measures for the protein structures are quite complex and require significant computational time. We propose a parallel approach to this problem to fully exploit the computational power of current CPU architectures. This paper summarizes experience and insights acquired from the parallel implementation of the SProt similarity method, its database access method, and also the wellknown TM-score algorithm. The implementation scales almost linearly with the number of CPUs and achieves 21.4× speedup on a 24-core system. The implementation is currently employed in the web application http://siret.cz/p3s.
Keywords
Download to read the full chapter text
Chapter PDF
References
Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B., Thornton, J.M.: CATH–a hierarchic classification of protein domain structures. Structure (London, England: 1993) 5(8), 1093–1108 (1997)
Balaji, S., Srinivasan, N.: Use of a database of structural alignments and phylogenetic trees in investigating the relationship between sequence and structural variability among homologous proteins. Protein Eng. 14(4), 219–226 (2001)
Chothia, C., Lesk, A.M.: The relation between the divergence of sequence and structure in proteins. The EMBO Journal 5(4), 823–826 (1986)
Aung, Z., Tan, K.L.: Rapid 3D protein structure database searching using information retrieval techniques. Bioinformatics 20(7), 1045–1052 (2004)
Tung, C.H.H., Huang, J.W.W., Yang, J.M.M.: Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for rapid search of protein structure database. Genome Biol. 8(3), R31 (2007)
Sacan, A., Toroslu, H.I., Ferhatosmanoglu, H.: Integrated search and alignment of protein structures. Bioinformatics 24(24), 2872–2879 (2008)
Galgonek, J., Hoksza, D., Skopal, T.: SProt: sphere-based protein structure similarity algorithm. BMC Proteome Science 9(suppl. 1), S20 (2011)
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443–453 (1970)
Zhang, Y., Skolnick, J.: Scoring function for automated assessment of protein structure template quality. Proteins 57(4), 702–710 (2004)
Kabsch, W.: A solution for the best rotation to relate two sets of vectors. Acta Crystallogr. A 32(5), 922–923 (1976)
Micó, M.L., Oncina, J., Vidal, E.: A new version of the nearest-neighbour approximating and eliminating search algorithm (AESA) with linear preprocessing time and memory requirements. Pattern Recognition Letters 15(1), 9–17 (1994)
Moreno-Seco, F., Micó, L., Oncina, J.: Extending LAESA Fast Nearest Neighbour Algorithm to Find the k Nearest Neighbours. In: Caelli, T.M., Amin, A., Duin, R.P.W., Kamel, M.S., de Ridder, D. (eds.) SSPR and SPR 2002. LNCS, vol. 2396, pp. 718–724. Springer, Heidelberg (2002)
Chávez, E., Navarro, G., Baeza-Yates, R.A., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)
Amdahl, G.: Validity of the single processor approach to achieving large scale computing capabilities. In: Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, pp. 483–485. ACM (1967)
Reinders, J.: Intel threading building blocks: outfitting C++ for multi-core processor parallelism. O’Reilly Media, Inc. (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Galgonek, J., Kruliš, M., Hoksza, D. (2013). On the Parallelization of the SProt Measure and the TM-Score Algorithm. In: Caragiannis, I., et al. Euro-Par 2012: Parallel Processing Workshops. Euro-Par 2012. Lecture Notes in Computer Science, vol 7640. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36949-0_27
Download citation
DOI: https://doi.org/10.1007/978-3-642-36949-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36948-3
Online ISBN: 978-3-642-36949-0
eBook Packages: Computer ScienceComputer Science (R0)