Abstract
This paper describes approaches to improving the performance of one of the most common and increasingly important operations of the Human Genome Project (HGP): large-volume, batch comparison of DNA sequence data. This basic comparison operation, usually carried out by the well-known BLAST program on one subject sequence against the internationally available databases of over 3 million target sequences, is already used hundreds of thousands of times each day by researchers around the world. At present, it is still used primarily in single-query or small-batch-query mode. As the entire sequence of the human genome nears completion, the area of functional genomics, and the use of microarrays of sets of genes, is coming to the fore. These developments will demand ever more efficient means of BLASTing sets of data and will make single-processor implementation on powerful workstations infeasible. We describe the three primary parallel components to BLAST. The first is at the sequence-to-sequence comparison level. The second parallelizes a single query across a partitioned and distributed database. Finally, the set of queries itself is partitioned across a set of servers with replicated or partitioned databases. The three methods may be employed alone or in concert. We describe our current implementation, which parallelizes batch requests, and our plans for implementing the other levels. The results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation.
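As an illustration only, the second and third levels described above (one query across a partitioned database, and a batch of queries across servers with a replicated database) might be sketched as follows. This is not the authors' implementation: `toy_score` is a hypothetical stand-in for a BLAST comparison that merely counts shared k-mers, threads stand in for workstations in a cluster, and all function names are assumptions made for this sketch. The first level, parallelism inside a single sequence-to-sequence comparison, is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Hit = Tuple[str, int]  # (target name, toy score)


def toy_score(query: str, target: str, k: int = 3) -> int:
    """Hypothetical stand-in for BLAST: number of distinct k-mers shared."""
    q = {query[i:i + k] for i in range(len(query) - k + 1)}
    t = {target[i:i + k] for i in range(len(target) - k + 1)}
    return len(q & t)


def rank(hits: List[Hit]) -> List[Hit]:
    """Order hits by descending score, then by name for determinism."""
    return sorted(hits, key=lambda h: (-h[1], h[0]))


def search(query: str, db: Dict[str, str]) -> List[Hit]:
    """Sequential baseline: compare one query against every target."""
    return rank([(name, toy_score(query, seq)) for name, seq in db.items()])


def partition_db(db: Dict[str, str], n: int) -> List[Dict[str, str]]:
    """Split the database into n disjoint partitions (round-robin by name)."""
    names = sorted(db)
    return [{name: db[name] for name in names[i::n]} for i in range(n)]


def search_partitioned(query: str, partitions: List[Dict[str, str]]) -> List[Hit]:
    """Database-level parallelism: one query, one worker per partition,
    partial hit lists merged and re-ranked afterwards."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial = list(pool.map(lambda part: search(query, part), partitions))
    return rank([hit for hits in partial for hit in hits])


def search_batch(queries: Dict[str, str], db: Dict[str, str],
                 workers: int = 2) -> Dict[str, List[Hit]]:
    """Query-level parallelism: each worker runs whole queries against
    its own (here, shared) copy of the full database."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(lambda q: search(q, db), queries.values()))
    return dict(zip(queries.keys(), hits))
```

Because the partitions are disjoint and the toy score of each target does not depend on the rest of the database, merging the partial hit lists gives the same ranking as the sequential search. Real BLAST statistics such as E-values depend on the size of the database searched, so a real partitioned search would need to account for that when merging.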
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pedretti, K., Casavant, T., Braun, R., Scheetz, T., Birkett, C., Roberts, C. (1999). Three Complementary Approaches to Parallelization of Local BLAST Service on Workstation Clusters. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 1999. Lecture Notes in Computer Science, vol 1662. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48387-X_29
DOI: https://doi.org/10.1007/3-540-48387-X_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66363-8
Online ISBN: 978-3-540-48387-8
eBook Packages: Springer Book Archive