Abstract
We have created a high-throughput grid for biological sequence analysis that is freely accessible through bioinformatics Web services. The system executes computationally intensive sequence alignment algorithms, such as Smith-Waterman or hidden Markov model searches, with speedups of up to three orders of magnitude over single-CPU installations. Users around the world can now run highly sensitive sequence alignments with a turnaround time similar to that of BLAST tools. The grid combines high-throughput accelerators at two bioinformatics facilities in different geographical locations. The accelerators include TimeLogic DeCypher boards, a Paracel GeneMatcher2 accelerator, and Paracel BlastMachines. The Sun N1 Grid Engine software performs distributed resource management. Clients communicate with the grid through the existing open BioMOBY Web services infrastructure. We also illustrate bioinformatics grid strategies for distributed load balancing and report several nontrivial technical solutions that may serve as templates for adaptation by other bioinformatics groups.
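To illustrate why an algorithm such as Smith-Waterman counts as computationally intensive, the following minimal Python sketch computes a local alignment score with the standard dynamic-programming recurrence. It is not the implementation running on the grid accelerators; the function name, the linear gap penalty, and the scoring values are assumptions chosen only for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values and the linear gap penalty are illustrative assumptions,
    not parameters taken from the paper.
    """
    # Only the previous row of the score matrix is needed to compute the score.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The nested loops visit every pair of positions, so the running time grows with the product of the two sequence lengths. Searching one query against a large database repeats this work for every database sequence, which is the kind of load that dedicated accelerators are meant to absorb.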
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Wang, C., Gordon, P.M.K., Turinsky, A.L., Burgess, J., Dalton, T., Sensen, C.W. (2007). Combining a High-Throughput Bioinformatics Grid and Bioinformatics Web Services. In: Dubitzky, W., Schuster, A., Sloot, P.M.A., Schroeder, M., Romberg, M. (eds) Distributed, High-Performance and Grid Computing in Computational Biology. GCCB 2007. Lecture Notes in Computer Science(), vol 4360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69968-2_1
DOI: https://doi.org/10.1007/978-3-540-69968-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69841-8
Online ISBN: 978-3-540-69968-2
eBook Packages: Computer Science