Cluster Computing for Determining Three-Dimensional Protein Structure

The Journal of Supercomputing

Abstract

Determining the three-dimensional structure of proteins is crucial to efficient drug design and to understanding biological processes. One successful method for computing a molecule’s shape relies on inter-atomic distance bounds provided by Nuclear Magnetic Resonance (NMR) spectroscopy. Both the accuracy of the computed structures and the time required to obtain them improve greatly when the gaps between the upper and lower distance bounds are reduced. These gaps are reduced most effectively by applying the tetrangle inequality, derived from the Cayley-Menger determinant, to all atom quadruples. However, tetrangle-inequality bound smoothing is an extremely computation-intensive task, requiring O(n^4) time for an n-atom molecule. To reduce computation time, we propose a novel coarse-grained parallel algorithm intended for a Beowulf-type cluster of PCs. The algorithm employs p ≤ n/6 processors and requires O(n^4/p) time and O(p^2) communications, where n is the number of atoms in a molecule. The number of communications is at least an order of magnitude lower than in earlier parallelizations. Our implementation utilized processors with at least 59% efficiency (including the communication overhead), an impressive figure for a non-embarrassingly parallel problem on a cluster of workstations.
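To make the quadruple test concrete, the following is a minimal Python sketch of the Cayley-Menger determinant for four points, given exact squared distances. It is not the authors' bound-smoothing algorithm, which tightens lower and upper bounds rather than testing exact distances. It only illustrates the underlying condition: for four points in three-dimensional space the determinant equals 288 V^2 (V being the tetrahedron volume), so a negative value means the six distances cannot be realised in 3D. The function names `det`, `cayley_menger`, and `violating_quadruples` are hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import comb


def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 5x5)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        if a == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def cayley_menger(sq, quad):
    """Cayley-Menger determinant of the four points indexed by quad.

    sq[i][j] is the squared distance between points i and j.
    """
    m = [[0, 1, 1, 1, 1]]
    for i in quad:
        m.append([1] + [sq[i][j] for j in quad])
    return det(m)


def violating_quadruples(sq):
    """Quadruples with a negative determinant, i.e. not realisable in 3D."""
    n = len(sq)
    return [q for q in combinations(range(n), 4) if cayley_menger(sq, q) < 0]


# Unit right-angled tetrahedron: (0,0,0), (1,0,0), (0,1,0), (0,0,1).
ok = [[0, 1, 1, 1], [1, 0, 2, 2], [1, 2, 0, 2], [1, 2, 2, 0]]
# Point 0 at distance 1 from every vertex of an equilateral triangle of side 2:
# all triangle inequalities hold, yet no such point exists in 3D.
bad = [[0, 1, 1, 1], [1, 0, 4, 4], [1, 4, 0, 4], [1, 4, 4, 0]]

print(cayley_menger(ok, (0, 1, 2, 3)))   # 8, i.e. 288 * (1/6)**2
print(cayley_menger(bad, (0, 1, 2, 3)))  # -32
print(violating_quadruples(bad))         # [(0, 1, 2, 3)]
print(comb(100, 4))                      # 3921225 quadruples for n = 100
```

The last line shows why the cost grows as O(n^4): even a 100-atom molecule has nearly four million quadruples, and each must be examined at least once.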

Author information

Corresponding author

Correspondence to Paulius Micikevicius.

About this article

Cite this article

Micikevicius, P., Deo, N. Cluster Computing for Determining Three-Dimensional Protein Structure. J Supercomput 34, 243–271 (2005). https://doi.org/10.1007/s11227-005-1168-0

  • DOI: https://doi.org/10.1007/s11227-005-1168-0
